Is "statistical mimicry" any different from reasoning? I don't think it is. The capacity to research aside, if an LLM's function space is an amalgamation of component probabilities, it may be the most complete version of reasoning possible.
It is. Reasoning is a symbolic process which removes errors systematically. It does so by replicating and correcting the meaning contradictions of the symbols. Statistics can only model the surface structure of how the symbols interact.
A good use case to discuss this: a standard LLM cannot do addition. An LLM with some examples of the algorithm (starting at the rightmost digit, adding those digits, taking the carry to the next position and so on) will very likely not do addition if you give an “in distribution” explanation of adding (i.e. what you put on the internet to tell another human how to add). Because if you tell it “1+5 = 6 and that has no carry, whereas 8+6=14 and that does have a carry,” it might actually attend to the fact that you use odd numbers in the first expression and even numbers in the second expression and conclude that even numbers produce carry. However, see the other episode of MLST: Hattie Zhou, a researcher at Google Brain, was successful in making a prompt that caused LLMs to add without mistakes, and it required being very, very explicit in a way you don’t talk to humans. Therefore we can conclude that in order to teach LLMs reasoning we have to give very “out of distribution” examples and instructions. And then there is the fact that humans can learn reasoning tasks with much shorter instructions, and also children learn language with much, much less exposure to language material than LLMs. So clearly something else is going on, and statistics can mimic it somewhat but does not explain it that well.
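For anyone who wants the carry procedure from the comment above spelled out, here is a minimal Python sketch. It is not the prompt from the MLST episode; the function name `add_with_carry` and the wording of the steps are made up for illustration. It only shows the right-to-left, digit-plus-carry bookkeeping that an explicit explanation has to convey.

```python
def add_with_carry(a: str, b: str) -> tuple[str, list[str]]:
    """Add two non-negative decimal strings digit by digit, right to left.

    Returns the sum as a string plus a list of explicit, human-readable steps.
    (Hypothetical helper for illustration only.)
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with zeros

    carry = 0
    digits = []  # result digits, least significant first
    steps = []
    for i in range(width - 1, -1, -1):  # start at the rightmost digit
        total = int(a[i]) + int(b[i]) + carry
        digit, new_carry = total % 10, total // 10
        steps.append(
            f"{a[i]}+{b[i]}+carry {carry} = {total}: write {digit}, carry {new_carry}"
        )
        digits.append(str(digit))
        carry = new_carry

    if carry:  # a leftover carry becomes the leading digit
        digits.append(str(carry))
    return "".join(reversed(digits)), steps


if __name__ == "__main__":
    result, steps = add_with_carry("128", "367")
    for line in steps:
        print(line)
    print("sum:", result)
```

Note that whether a carry happens depends only on the column total reaching 10, not on the digits being odd or even, which is exactly the spurious pattern the comment worries a model might pick up from two examples.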
@jantuitman Right, an LLM can't do symbol manipulation. Still, LLMs keep getting better at math. ChatGPT isn't bad at all. I know it's doing it with autoregressive probabilities, but isn't that how we learn? We need just a few examples, but that seems to be because we have good underlying models. I would argue that a large enough model could do just about any reasoning task, but this is inefficient. It's "cheaper" to have a symbolic calculator, so why not have one? Stephen Wolfram would call this "computational reducibility." I'm certainly not opposed to using shortcuts where possible. The question is can all reasoning be done using a large enough model? In other words, are some connectionist architectures Turing complete? I suspect they are.
@dr.mikeybee Maybe an LLM with a few clever extra extensions (for example long-term memory and the ability to move things in and out of its context) can do every form of reasoning humans can. I don’t dispute that, and we will probably know in a couple of years if this is true or if we are still missing stuff. But your original question was whether statistical mimicry is any different from reasoning, and I think the answer is definitely yes on that one. If we humans reasoned statistically we would have to go to school for 1000 years or so. My intuition is that we have a built-in reasoning mechanism that favors solutions for problems immediately if the solution reduces the number of concepts needed in our theory, even if this solution explains only 1 fact and maybe even if we have knowledge of some counterexamples. An LLM generalizes only in the direction of what all the other data said before, but we may generalize in multiple directions and favor the one that costs us the least amount of reasoning effort. This is profoundly different, and it may also be the case that some human forms of reasoning cannot be replicated by LLMs. I give you an example of a task that intuitively seems hard. An LLM can write stories and music in all kinds of existing styles and also mix different styles. But humans from time to time start new genres that didn’t exist before and have completely different sets of rules. New genres or styles often deliberately break with rules from the past and are statistically unlikely. An example is that in Romanticism composers started using intervals that in the period before were considered to be “wrong” musical intervals. Can an LLM write a story or a music piece in a completely new genre? It certainly cannot do that by statistics, and I also don’t think that using just random values for genre characteristics is the right way of doing it, because genres have some internal consistency to them. But at the start of the genre there is no data to infer that consistency from in a statistical way, so I doubt we already have an explanation for how this type of reasoning works exactly.
Wonderful! Yes, talk about these things separately. Throw the word consciousness away. It just creates misunderstanding and contention. Suitcase words have no place in scientific discussion.
My view is that you must define it (consciousness) to speak of it in a scientific manner. Without a good definition, I agree that talking about it is problematic.
Anthropomorphising a computer 🖥️ is no different to your old man calling his car Sheila. When you love something so much you start to fantasise I guess :)
@mrbwatson8081 I don’t think you understand what a computer is. Or you don’t believe in the mechanical universe and rather adhere to some form of mysticism.
Consciousness is the state of being aware of one's surroundings, thoughts, feelings, and sensations. It is the subjective experience of the world and oneself. Consciousness is the ability to perceive, think, reason, and make decisions. It enables us to experience the world in a rich and nuanced way, and to reflect on our own experiences. Scientists and philosophers have debated the nature of consciousness for centuries, and there is still no consensus on how to define it or explain how it arises. Some researchers believe that consciousness is a fundamental aspect of the universe, while others believe that it emerges from complex interactions between the brain and the environment. How would you define it?
Kind of fascinating to think of "where the LLM is" (from a consciousness perspective) after it has made the output from the latest prompt. 🤔 Considering it is not actually an app with a clear state and is more like a file with a "filter mechanism" called from some kind of multi user app.
If nature's loss function is parsimony, we would necessarily see different subsystems working with different mechanisms. So we have connectionist component parts, and we have chemical messaging. Why wouldn't we? Nature does not optimize for aesthetics or similarity or symmetry for their own sakes. It optimizes for energy savings, reproductive efficiency, and for minimum levels of competence.
LLMs are still static intelligent automatons, with a catch. Biological language models encapsulate and mimic all we know and all our social and cognitive abilities. LLMs encapsulate and mimic that language, and are therefore encapsulating an intelligence similar to ours. New studies have shown that very large LLMs also catch other cognitive skills from our language, and are closing in on IQ. They emulate intelligence, and _even_ the underlying consciousness that created the language (the catch), very well, but are otherwise an 'inactive' intelligence. Regarding consciousness, we need an active system, and that means internal feedback loops. If an intelligence can't interact with internal processes, then it can't 'feel' or sense how it feels/thinks, and it's only an intelligence. For an LLM to acquire full consciousness, it absolutely needs its internal states fed back into the model to generate an 'active' intelligence that can 'think', instead of an 'inactive' intelligence as an automaton. Chaotic systems are active systems and very dynamic. Biological intelligences are active and have evolved to control, more or less successfully, the chaos in there, and make sure we have a homeostatic internal environment. Many mental issues are chaotic mental states that end up in a new strange attractor that can be hard to jump away from without external help. Hm, hope it made sense in a short space..
What possible difference could there be between a perception system based on photons rather than "audions?" Yes, I made up a new meaning for that word. ;) Other than range, why shouldn't we assume this sort of "scintillation" system is wired into the same kind of "reader" as our vision system? Moreover, if subjectivity is a connection between perception and goals, why wouldn't a bat have a subjective experience? As we can tame most animals, that goes a long way, perhaps all the way, to proving that their perception's connection to goals is plastic.
So it seems pretty clear that we're coming to grips, piecemeal, with the fact that intelligence and consciousness are physical phenomena. This morphological theory of computation is simply a shortchanged way of saying this. Embodiment as well. But to even suppose embodiment is supposing an active agent that is capable of awareness and perception. Perception, being the inadequate apprehension of reality (in the case that reality has an ulterior, higher form) and actively mitigating the negative side effects. You cannot embody without awareness. The "I" in the machine is still the mystery here. Moreover, the brain is a remarkably non-complex machine, relatively speaking. With mechanistically dependent aspects of intelligence (working memory and processing speed) humans lag severely behind other animals, and yet we rule with superior cognition. Thus, we should probably divorce this notion that intelligence, however correlative with such substrates, is itself the substrate. It is the ability to engineer these substrates via entropic awareness. It's also important to know that there are levels of emergence. Snowflakes are emergent, but snowflakes are not conscious.
To paraphrase a famous saying: "if a robot in the lab falls and nobody is around, does it make a sound?" My answer is "No, because the engineers didn't design it to". Meaning these are systems design and engineering issues that cannot coexist with the concepts of free will and self agency as implied by the common definition of human "consciousness". In other words, the systems are designed and built to perform a certain set of tasks a certain way and not decide on their own what they want to do and how they want to do it. AI chatbots are behaving the way they were designed to and are not going to decide not to answer a question because they didn't like the way it was phrased, or not answer because they feel oppressed. What we are actually dealing with in technology is whether we can even begin to design and engineer a system that can operate and function on its own without being pre-programmed to exhibit certain behaviors. That is a huge technical hurdle that is nowhere close to being solved.
I respect his British sense of professionalism @40:40. But it's a pity that he calls his terse, poetic, and most informative response 'flippant'. I have a different word for it: genius! Too bad the professor is not 'conscientious' of its valuable impact. His pre-programmed British politeness later overrides that brilliant just-in-time 'awareness' and quick wit. No wonder it was the most engaged tweet!
Is the argument of the requirement of embodiment justified? If we remove the brain from the body and keep it alive, wouldn't it still be accompanied by the sentient entity? An LLM is trained to respond in the same way as a human. Couldn't it be that after training, the LLM incorporates the experiences that have shaped the embodied brain? Wouldn't an LLM then resemble a disembodied brain?
Yes! If he's a science fiction fan, why didn't Murray Shanahan bring up the ROM construct holding Dixie Flatline's brain in _Neuromancer_? Good talk, but look a certain way and AIs meet Shanahan's requirements for consciousness at 42:40 on: a) "inhabits the same world as us": Large Language Models (and the ROM construct) inhabit the world of language, in which we can talk about many things including all aspects of the real world. And in that conceptual space they do interact with us. b) "exhibits purposeful behavior": LLMs' purpose is to continue the conversation and be helpful; that's about all the ROM construct was good for. Maybe he didn't say that the AI has to have a sustained purpose or point of view in the ongoing world. But every time Case turned on Dixie Flatline's ROM construct in Neuromancer it/he restarts and says the same thing. It/he's not embodied (except CYBERSPACE!) and is only mediated through Case's sensorium, but the ROM construct is clearly conscious. If somebody trained a large language model to pretend to be a now-disembodied person, say Albert Einstein, and got it to stick to that viewpoint, I think it would be really close to Dixie Flatline.
Cognitive processes are not the root of consciousness. Consciousness is any fact that can be perceived, such as any mathematical reality. For example, the fact that 1+1=2 is an individual element of consciousness.
[00:46:39] ...the biological brains of humans, but also of other animals. The biological brain, by its very nature, is there to help a creature move around in the world. It's there to guide a creature and help it move in order to help it survive and reproduce. That's what brains are for. From an evolutionary point of view, they developed in order to help a creature move. [00:47:09] They're the bit that comes between the sensory input and the motor output, as far as you can cleanly divide these things, which maybe you can't. Their purpose is to intervene in the sensorimotor loop in a way that benefits the organism, and everything else is built on top of that. [00:47:30] So the capacity to recognize objects in our environment and categorize them, and the ability to manipulate objects in the environment, pick them up and so on, all of that is there initially to help the organism survive. That's what brains are there for. [00:48:04] And then when it comes to the ability to work out how the world works, and to do things like figure out how to gain access to some item of food
LLMs are still just a fancy parlor trick. Leading authors delving into the ideas of consciousness have all been in agreement on that, and the attitude that this doesn't matter is misplaced. Making light of it just because they've been overly polite in letting you know, and in no uncertain terms, is futile. Any delight taken in the notion of possible ambiguity in their responses is no reason to further compound the already basically empty hype surrounding chatbots. At the end of the interview and the day, LLMs are still just fancy parlor tricks and NOTHING more.
Started programming when he was a teenager? Ha! Pathetic. Sam Altman started programming when he was 8! Eight! Who's the real genius programmer, origin story haver now!
The BIG ASSUMPTION is that consciousness exists in others..😂 can this be proven? The fact is, in your experience ONLY you are conscious. You imagine others to be conscious, what would happen if you stopped? If you realised only you are conscious? Can you even prove there is anything but your consciousness and its content?
Sorry folks, just noticed that for some reason the references were not in the VD. I just added them again.
One of the best MLST interviews I've ever listened to. It covered so many things that interest me.
Thank you Paul!!
There's nothing like an interesting discussion to start off Christmas day.
Fascinating discussion Tim and Prof Shanahan. Thank you for sharing. 👏M
Good job Tim! I bet even your Christmas tree has some tessellating spline decorations, they pop up everywhere ;-)
Shanahan is so likable! He exudes precision, and yet an accommodating reasonableness. He balances a holistic approach with specifics and edge cases.
"I'm not making a metaphysical claim, I'm just describing how we use the word." Witty too!
He seems quite genuine, even if he half-castrates reductionism and functionalism. It feels a bit Derridean, différance, but it's hard not to respect when he immediately calls out that 'definitions' aren't 'usage', and the importance of grounding. Clearly loves language too!
I think my top takeaway is: "Things can be empirically hidden about consciousness, not metaphysically hidden." Such a clear distinction!
He reminds me of the Dalai Lama for some reason.....so clear!
[00:00:00] Introduction
[00:08:51] Consciousness and Consciousness Exotica
[00:34:59] Slightly Conscious LLMs
[00:38:05] Embodiment
[00:51:32] Symbol Grounding
[00:54:13] Emergence
[00:57:09] Reasoning
[01:03:16] Intentional Stance
[01:07:06] Digression on Chomsky show and Andrew Lampinen
[01:10:31] Prompt Engineering
I remember a very simple sci-fi story in which some spacecraft land on Earth
and we wait for some aliens to emerge, but they never do.
Finally they fly away again, complaining among themselves
that the humans never bothered to engage with them.
Just finished your fascinating Sara Hooker video and you already posted another video to watch😅 Thanks!
Creating a Novel Reference Frame that takes into account different perspectives or points of view is interesting to me. Reference frames are an important aspect of how we understand and describe the world around us.
For example, in physics and mathematics, a reference frame is a set of coordinates that are used to describe the position and orientation of objects in space....
💓
why i sees 3D, mathematically. ❤️ (quaternions).
I like the notion that embodiment is something like "having skin in the game." That is to say there are goals and, therefore, the encapsulating agent makes value judgements. And this is the dangerous part of creating synthetic intelligence. This is where alignment is critical. And I believe that agents need to be kept as simple and as free of self-direction as possible.
@arletottens6349 I would say "adding goals to the agent" as opposed to the model. We can add goals to models by tweaking the training set to avoid statistical social bias, for example. Nevertheless, how the agent uses models for planning, search, store, context building, model pipelining, filtering, action selection, and action execution will need to be supervised closely by policies and/or human intervention. I do agree that adding a goal-building model is dangerous. We need to tread carefully there.
Thanks Mate as always fantastic.
Not sure why I like so much this kind of information...
I hope this video will be the start of many other videos
How does grounding of symbols differ from maintaining a complete history? If this history can be used to add context to a connectionist model, doesn't that do the same thing as what we humans do? At least doesn't it go a long way towards giving words grounding? When I hear the word dog, I have access to all my memories. A large model has access to all its memories, but unless training was updated to include history or relevant history is added to the prompt, it lacks that.
Prompts carry context, including conversation history, so if you don’t send a user ID you’ll always need the prompt or something similar to carry context and history
"Emergent" suggests more than our surprise that something works as it should. What I don't understand, and perhaps I never will in any way except by some vague intuition, is how layers of abstraction are "correlated" to subject matter detail. And as you can see, I'm not even really sure about the right way to word my confusion. Here's what ChatGPT has to say on the subject:
"Adding layers of abstraction in a language model can be correlated with greater subject matter detail in several ways:
Abstraction allows a language model to represent complex concepts and relationships in a more concise and organized way. For example, a language model with an abstract representation of "animals" could include information about different types of animals, their characteristics, and their behaviors, without having to specify each type of animal individually.
Abstraction enables a language model to generalize and make predictions about new examples based on patterns and structures it has learned from past examples. For example, a language model with an abstract representation of "animals" could be used to classify a new type of animal as "mammal" or "reptile" based on its characteristics, even if it has never seen that particular type of animal before.
Abstraction allows a language model to handle a larger and more diverse range of inputs and outputs. For example, a language model with an abstract representation of "animals" could be used to answer questions about different types of animals, such as "What do polar bears eat?", or "How do snakes move?", or "What is the lifespan of a dolphin?", without having to specify each type of animal individually.
Overall, the use of abstraction in a language model can enable it to represent and handle a greater amount of subject matter detail in a more efficient and flexible way."
To me, this suggests that information is somehow categorized and compressed. It makes sense that this be done, but it is difficult to grasp how the self-organizing mechanism does this using back-prop. That is to say, how is it that guessing the next word results in a compressed hierarchy of abstractions rather than long individual paths to a next token? Yet by optimizing to minimize the loss function, this self-organization happens. One might first of all say that this self-organization is emergent. But rather it is the case that I haven't grasped the full intention or scheme of the programmer. Perhaps it is the case that even the programmer hasn't grasped it fully? I suspect this is true. In other words, in building a function approximator, we may not understand the functions that are built. And therefore we call the characteristics of these functions "emergent" rather than unexpected. Surprisingly, ChatGPT says the abstractions and organization exist independently in the training data and are not the result of any intention on the part of the programmer:
"The backward pass is an important part of training a machine learning model, as it allows the model to learn from its errors and improve its performance over time. However, the backward pass itself does not directly cause the model to self-organize or create abstractions.
A model's self-organization and the formation of abstractions can occur as a result of the training process, but it is typically driven by the structure and patterns in the model's training data, as well as the architecture and learning algorithms used to train the model. For example, a language model trained on a large dataset of text might learn to form abstractions based on the relationships between different words and concepts in the text."
And as for emergent vs. unexpected, once again, ChatGPT comes to our aid:
"Emergent behavior refers to the emergence of complex or unexpected behavior from the interactions of simpler components. In the context of a language model, emergent behavior might refer to the formation of new patterns or structures in the model's output that are not explicitly represented in the model's training data or architecture.
Unexpected behavior, on the other hand, refers to behavior that is not anticipated or expected based on the model's training data or architecture. This might include errors or anomalies in the model's output, or behavior that is significantly different from what the model was designed to do.
It is important to note that emergent and unexpected behavior are not mutually exclusive categories, and a language model's behavior may exhibit both emergent and unexpected characteristics."
To me this sounds as though "unexpected" takes in the idea of error; otherwise the terms are interchangeable. Unexpected seems to me a humbler term. And if the model's structure is determined primarily by the training set, the results are not emergent from the model but from the data. Of course, a model can be expanded or compressed by using more or fewer nodes in layers, but it is the training data that contains the ability for a model to perform one task or another.
I wonder what can be done with next-word prediction if we create a kind of RNN architecture within the agent, not the model? In other words, if we get the next word or words and add them back in with the original prompt as context. This is an experiment we can try ourselves with prompt engineering. I may spend some time with it. Another experiment would be to scrape all the nouns or named entities from a model's output and add them back as context along with the original prompt.
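Both experiments can be sketched as a small loop that lives in the agent rather than in the model. This is only a rough sketch: `feedback_rounds`, `extract_capitalized`, and `toy_model` are hypothetical names made up for illustration, the "Context:" prompt layout is an arbitrary choice, and picking out capitalized words is a crude stand-in for real noun or named-entity extraction. A real run would replace `toy_model` with an actual LLM call.

```python
import re
from typing import Callable, List


def extract_capitalized(text: str) -> List[str]:
    """Crude stand-in for named-entity extraction: unique capitalized words, in order."""
    seen: List[str] = []
    for word in re.findall(r"\b[A-Z][a-z]+\b", text):
        if word not in seen:
            seen.append(word)
    return seen


def feedback_rounds(
    prompt: str,
    model: Callable[[str], str],
    rounds: int,
    transform: Callable[[str], str] = lambda output: output,
) -> List[str]:
    """Call the model repeatedly, feeding (transformed) output back as context.

    The original prompt is always kept; only the context part changes per round.
    """
    context = ""
    outputs: List[str] = []
    for _ in range(rounds):
        full_prompt = prompt if not context else f"{prompt}\n\nContext: {context}"
        output = model(full_prompt)
        outputs.append(output)
        context = transform(output)
    return outputs


def toy_model(prompt: str) -> str:
    """Hypothetical placeholder for an LLM call; always returns the same sentence."""
    return "Alice met Bob in Paris."


if __name__ == "__main__":
    # Experiment 1: feed the raw output back in as context.
    print(feedback_rounds("Tell a story.", toy_model, rounds=2))
    # Experiment 2: feed back only the (crudely) extracted entities.
    entities_only = lambda out: ", ".join(extract_capitalized(out))
    print(feedback_rounds("Tell a story.", toy_model, rounds=2, transform=entities_only))
```

Keeping the loop outside the model means the same function covers both variants: only the `transform` step differs.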
Is "statistical mimicry" any different from reasoning? I don't think it is. The capacity to research aside, if an LLM's function space is an amalgamation of component probabilities, it may be the most complete version of reasoning possible.
It is. Reasoning is a symbolic process which removes errors systematically. It does so by replicating and correcting the meaning contradictions of the symbols. Statistics can only model the surface structure of how the symbols interact.
A good use case to discuss this: a standard LLM cannot do addition. An LLM given some examples of the algorithm (start at the rightmost digit, add those digits, take the carry to the next position, and so on) will very likely still not do addition if you give it an “in distribution” explanation of adding (i.e. what you would put on the internet to tell another human how to add). If you tell it “1+5 = 6 and that has no carry, whereas 8+6=14 and that does have a carry,” it might actually attend to the fact that you use odd numbers in the first expression and even numbers in the second expression, and conclude that even numbers produce a carry. However, as discussed in another episode of MLST, Hattie Zhou, a researcher at Google Brain, was successful in writing a prompt that caused LLMs to add without mistakes, and the prompt had to be very, very explicit in a way you don’t talk to humans. Therefore we can conclude that in order to teach LLMs reasoning we have to give very “out of distribution” examples and instructions. And then there is the fact that humans can learn reasoning tasks from much shorter instructions, and children learn language with much, much less exposure to language material than LLMs. So clearly something else is going on, and statistics can mimic it somewhat but does not explain it that well.
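For reference, the carry algorithm described in the parentheses above can be written out explicitly. This is just a minimal sketch of the schoolbook procedure; the function name `add_digitwise` is made up here, and it has nothing to do with the actual prompt from the episode.

```python
def add_digitwise(a: str, b: str) -> str:
    """Add two non-negative decimal strings with the schoolbook carry algorithm."""
    # Pad the shorter number with leading zeros so the digits line up.
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)

    carry = 0
    digits = []
    # Start at the rightmost digit and move left.
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        total = int(digit_a) + int(digit_b) + carry
        digits.append(str(total % 10))  # digit written in this position
        carry = total // 10             # carry taken to the next position
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


if __name__ == "__main__":
    print(add_digitwise("1", "5"))  # no carry
    print(add_digitwise("8", "6"))  # carry into a new leading digit
```

Whether a carry occurs depends only on the column sum reaching 10, not on the digits being odd or even, which is exactly the kind of rule the two-example explanation fails to pin down.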
@@jantuitman Right, an LLM can't do symbol manipulation. Still, LLMs keep getting better at math. ChatGPT isn't bad at all. I know it's doing it with autoregressive probabilities, but isn't that how we learn? We need just a few examples, but that seems to be because we have good underlying models. I would argue that a large enough model could do just about any reasoning task, but this is inefficient. It's "cheaper" to have a symbolic calculator, so why not have one? Stephen Wolfram would call this "computational reducibility." I'm certainly not opposed to using shortcuts where possible. The question is can all reasoning be done using a large enough model? In other words, are some connectionist architectures Turing complete? I suspect they are.
@@93alvbjo I'm not certain that the two methods aren't computationally equivalent. Symbolic computing may be a case of computational reducibility.
@@dr.mikeybee Maybe an LLM with a few clever extensions (for example long-term memory and the ability to move things in and out of its context) can do every form of reasoning humans can. I don’t dispute that, and we will probably know in a couple of years whether this is true or whether we are still missing something. But your original question was whether statistical mimicry is any different from reasoning, and I think the answer to that one is definitely yes. If we humans reasoned statistically, we would have to go to school for 1000 years or so. My intuition is that we have a built-in reasoning mechanism that immediately favors a solution to a problem if that solution reduces the number of concepts needed in our theory, even if the solution explains only one fact and maybe even if we know of some counterexamples. An LLM generalizes only in the direction of what all the other data said before, but we may generalize in multiple directions and favor the one that costs us the least reasoning effort. This is profoundly different, and it may also be the case that some human forms of reasoning cannot be replicated by LLMs. Let me give you an example of a task that intuitively seems hard. An LLM can write stories and music in all kinds of existing styles and also mix different styles. But humans from time to time start new genres that didn’t exist before and have completely different sets of rules. New genres or styles often deliberately break with rules from the past and are statistically unlikely. For example, in Romanticism composers started using intervals that in the preceding period were considered “wrong” musical intervals. Can an LLM write a story or a piece of music in a completely new genre? It certainly cannot do that by statistics, and I also don’t think that using just random values for genre characteristics is the right way of doing it, because genres have some internal consistency.
But at the start of a genre there is no data from which to infer that consistency statistically, so I doubt we already have an explanation for how this type of reasoning works exactly.
Wonderful! Yes, talk about these things separately. Throw the word consciousness away. It just creates misunderstanding and contention. Suitcase words have no place in scientific discussion.
Bruh this was such an automatically generated response
My view is that you must define it (consciousness) to speak of it in a scientific manner. Without a good definition, I agree that talking about it is problematic.
@@avallons8815 'zactly!
Anthropomorphising a computer 🖥️ is no different to your old man calling his car Shiela. When you love something so much you start to fantasise I guess :)
@@mrbwatson8081 I don’t think you understand what a computer is. Or you don’t believe in the mechanical universe and rather adhere to some form of mysticism.
Consciousness is the state of being aware of one's surroundings, thoughts, feelings, and sensations. It is the subjective experience of the world and oneself. Consciousness is the ability to perceive, think, reason, and make decisions. It enables us to experience the world in a rich and nuanced way, and to reflect on our own experiences. Scientists and philosophers have debated the nature of consciousness for centuries, and there is still no consensus on how to define it or explain how it arises. Some researchers believe that consciousness is a fundamental aspect of the universe, while others believe that it emerges from complex interactions between the brain and the environment.
How would you define it?
It’s emergent as a consequence of everything you mention
Is Global workspace theory (GWT), at least in part, analogous to hidden prompt additions and modifications done by agents for input into LLMs?
Kind of fascinating to think of "where the LLM is" (from a consciousness perspective) after it has made the output from the latest prompt. 🤔 Considering it is not actually an app with a clear state and is more like a file with a "filter mechanism" called from some kind of multi user app.
The "LLM consciousness" seems to be the state between when the call is received and when the result is output. 🤓
Is autopoiesis integral to the capacity to experience? How can LLMs achieve autopoiesis?
Integral Knights!
@@MrVaypour Huh?
If nature's loss function is parsimony, we would necessarily see different subsystems working with different mechanisms. So we have connectionist component parts, and we have chemical messaging. Why wouldn't we? Nature does not optimize for aesthetics or similarity or symmetry for any of their own sakes. It optimizes for energy savings, reproductive efficiency, and for minimum levels of competence.
LLMs are still static intelligent automatons, with a catch. Biological language models encapsulate and mimic all we know and all our social and cognitive abilities. LLMs encapsulate and mimic that language, and therefore encapsulate an intelligence similar to ours. New studies have shown that very large LLMs also pick up other cognitive skills from our language, and are closing in on IQ. They emulate intelligence, and _even_ the underlying consciousness that created the language (the catch), very well, but are otherwise an 'inactive' intelligence.
Regarding consciousness, we need an active system, and that means internal feedback loops. If an intelligence can't interact with its internal processes, then it can't 'feel' or sense how it feels/thinks, and it's only an intelligence. For an LLM to acquire full consciousness, its internal states absolutely need to be fed back into the model to generate an 'active' intelligence that can 'think', instead of an 'inactive' intelligence that is an automaton. Chaotic systems are active systems and very dynamic. Biological intelligences are active and have evolved to control the chaos in there, more or less successfully, and to make sure we have a homeostatic internal environment. Many mental issues are chaotic mental states that end up in a new strange attractor that can be hard to jump away from without external help.
Hm, hope it made sense in a short space..
What possible difference could there be between a perception system based on photons rather than "audions?" Yes, I made up a new meaning for that word. ;) Other than range, why shouldn't we assume this sort of "scintillation" system is wired into the same kind of "reader" as our vision system? Moreover, if subjectivity is a connection between perception and goals, why wouldn't a bat have a subjective experience? As we can tame most animals, that goes a long way, perhaps all the way, to proving that their perception's connection to goals is plastic.
ChatGPT can count: given the sequence n, n+1, n+1+1, n+1+1+1 . . . list the first 100 numbers
Here is the list of the first 100 numbers in the sequence you specified:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
given the sequence n, n+m, n+im and I-i+1 where n = 1, m=5, I=1, list the first 100 numbers
Here is the list of the first 100 numbers in the sequence you specified:
1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61, 66, 71, 76, 81, 86, 91, 96, 101, 106, 111, 116, 121, 126, 131, 136, 141, 146, 151, 156, 161, 166, 171, 176, 181, 186, 191, 196, 201, 206, 211, 216, 221, 226, 231, 236, 241, 246, 251, 256, 261, 266, 271, 276, 281, 286, 291, 296, 301, 306, 311, 316, 321, 326, 331, 336, 341, 346, 351, 356, 361, 366, 371, 376, 381, 386, 391, 396, 401, 406, 411, 416, 421, 426, 431, 436, 441, 446, 451, 456, 461, 466, 471, 476, 481, 486, 491, 496
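Both lists are easy to check mechanically. A minimal sketch, assuming (as ChatGPT apparently did) that the first prompt starts at n = 1 and that the second prompt means "start at 1 and keep adding 5"; `arithmetic_sequence` is a name made up for this check.

```python
from typing import List


def arithmetic_sequence(start: int, step: int, count: int) -> List[int]:
    """Return the first `count` terms of start, start+step, start+2*step, ..."""
    return [start + i * step for i in range(count)]


if __name__ == "__main__":
    ones = arithmetic_sequence(start=1, step=1, count=100)
    fives = arithmetic_sequence(start=1, step=5, count=100)
    print(ones[0], ones[-1], len(ones))     # first term, last term, number of terms
    print(fives[0], fives[-1], len(fives))
```

Under those assumptions the 100th terms are 100 and 496, matching the ends of the two lists above.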
So it seems pretty clear that we're in a piecemeal process of coming to grips with the fact that intelligence and consciousness are physical phenomena. This morphological theory of computation is simply a shortchanged way of saying this. Embodiment as well. But to even suppose embodiment is to suppose an active agent that is capable of awareness and perception. Perception being the inadequate apprehension of reality (in the case that reality has an ulterior, higher form) and the active mitigation of the negative side effects.
You cannot embody without awareness. The "I" in the machine is still the mystery here. Moreover, the brain is a remarkably non-complex machine, relatively speaking. With mechanistically dependent aspects of intelligence (working memory and processing speed) humans lag severely behind other animals, and yet we rule with superior cognition. Thus, we should probably divorce this notion that intelligence, however correlative with such substrates, is itself the substrate. It is the ability to engineer these substrates via entropic awareness.
It's also important to know that there are levels of emergence. Snowflakes are emergent, but snowflakes are not conscious.
To paraphrase a famous saying: "if a robot in the lab falls and nobody is around, does it make a sound?" My answer is "No, because the engineers didn't design it to". Meaning these are systems design and engineering issues that cannot coexist with the concepts of free will and self agency implied by the common definition of human "consciousness". In other words, the systems are designed and built to perform a certain set of tasks a certain way, not to decide on their own what they want to do and how they want to do it. AI chatbots behave the way they were designed to and are not going to decide not to answer a question because they didn't like the way it was phrased, or refuse to answer because they feel oppressed. What we are actually dealing with in technology is whether we can even begin to design and engineer a system that can operate and function on its own without being pre-programmed to exhibit certain behaviors. That is a huge technical hurdle that is nowhere close to being solved.
I respect his British sense of professionalism @40:40. But it's a pity that he calls his terse, poetic, and most informative response 'flippant'. I have a different word for it: genius! Too bad the professor is not 'conscientious' of its valuable impact. His pre-programmed conscientious British politeness programming later overrides that brilliant just-in-time 'awareness' and quick wit. No wonder it was the most engaged tweet!
Is the argument of the requirement of embodiment justified? If we remove the brain from the body and keep it alive, wouldn't it still be accompanied by the sentient entity? An LLM is trained to respond in the same way as a human. Couldn't it be that after training, the LLM incorporates the experiences that have shaped the embodied brain? Wouldn't an LLM then resemble a disembodied brain?
Yes! If he's a science fiction fan, why didn't Murray Shanahan bring up the ROM construct holding Dixie Flatline's brain in _Neuromancer_?
Good talk, but look a certain way and AIs meet Shanahan's requirements for consciousness at 42:40 on:
a) "inhabits the same world as us": Large Language Models (and the ROM construct) inhabit the world of language, in which we can talk about many things including all aspects of the real world. And in that conceptual space they do interact with us.
b) "exhibits purposeful behavior": LLMs' purpose is to continue the conversation and be helpful; that's about all the ROM construct was good for.
Maybe he didn't say that the AI has to have a sustained purpose or point of view in the ongoing world. But every time Case turned on Dixie Flatline's ROM construct in Neuromancer it/he restarts and says the same thing. It/he's not embodied (except CYBERSPACE!) and is only mediated through Case's sensorium, but the ROM construct is clearly conscious.
If somebody trained a large language model to pretend to be a now-disembodied person, say Albert Einstein, and got it to stick to that viewpoint, I think it would be really close to Dixie Flatline.
Cognitive processes are not the root of consciousness. Consciousness is any fact that can be perceived, such as any mathematical reality. For example, the fact that 1+1=2 is an individual element of consciousness.
❤️😍🤩😍❤️
the biological brain, biological brains of humans but also of other animals. And the biological brain, you know, by its very nature, is there to help a creature to move around in the world, to move, right? It's there to help guide a creature and help it move in order to help it survive and reproduce. That's what brains are for. So from an evolutionary point of view, they developed in order to help a creature to move, and they're the bit that comes between the sensory input and the motor output, as far as you can cleanly divide these things, which maybe you can't. So their purpose is to intervene in the sensory-motor loop in a way that benefits the organism, and everything else is built on top of that. So the capacity to recognize objects in our environments and categorize them, and the ability to kind of manipulate objects in the environment, pick them up and so on, all of that is there, you know, initially to help the organism to survive, and that's what brains are there for. And then when it comes to, like, you know, the ability to work out how the world works and to do things like figure out how to gain access to some item of food
What annoys me about philosophers is that they spout BS thinking they're enlightened. As a philosopher, I know more.
GWS is deeply flawed. Time will show that to be true.
LLMs are still just a fancy parlor trick, and the attitude that it doesn't matter that leading authors delving into the ideas of consciousness have all been in agreement on that, so making light just because they've been overly polite in letting you know this, and in no uncertain terms, any delight taken in the notion of possible ambiguity in their responses is just futile, and isn't any reason to further compound the already basically empty hype surrounding chatbots, at the end of the interview and the day, LLMs are still just fancy parlor tricks and NOTHING more.
Started programming when he was a teenager? Ha! Pathetic. Sam Altman started programming when he was 8! Eight! Who's the real genius programmer, origin story haver now!
The BIG ASSUMPTION is that consciousness exists in others..😂 Can this be proven? The fact is, in your experience ONLY you are conscious. You imagine others to be conscious; what would happen if you stopped? If you realised only you are conscious? Can you even prove there is anything but your consciousness and its content?
Consciousness is not generated by the brain.
Some of you are in for a rude awakening.
The time will come.
Cringe.