LLMs Are Going to a Dead End? Explained | AGI Lambda

  • Published Dec 30, 2024

Comments • 145

  • @AGI.Lambdaa
    @AGI.Lambdaa  7 days ago +14

    This video was created with the assumption that the audience already possesses a detailed understanding of how large language models (LLMs) function, as well as the challenges they currently face. However, based on the feedback in the comments, it appears that further clarification is needed to effectively convey the intended message. I believe it would be beneficial to produce another video to provide a more in-depth explanation of this issue.
    *Yes, we may have disagreements, but you can convey them using logical reasoning instead of attacking each other in the comments section. This is a technical video highlighting some core issues with traditional LLMs, which are hurdles in the development of AGI. We will discuss the O3 model and more later.*
    *You can also join our discord server for discussion on the topic.*

    • @wwkk4964
      @wwkk4964 7 days ago

      It would be great if you could answer a simple yes/no question to clarify what you are talking about: was Microsoft's Tay chatbot pulled within 16 hours of launch in 2016 because it was not learning continuously from its language interactions with users over that period?

    • @AGI.Lambdaa
      @AGI.Lambdaa  6 days ago

      @@wwkk4964 idk why it was pulled, but the direct answer is that it was not a continuously learning model.

    • @wwkk4964
      @wwkk4964 6 days ago

      @@AGI.Lambdaa Had you been alive in 2016, you would have had a hearty laugh at everyone back then claiming (including an apology from Microsoft, no less) that Tay learned without filters from her interactions online with Twitter users, and that this was the reason it started tweeting the "racist" and other inflammatory material that 4chan brigaded her with. Since you don't think it was learning (continually), you must think it was a deterministic finite automaton outputting things Microsoft taught it in its training data (the only hypothesis that fits your worldview). Now go and check screenshots of what Tay was actually tweeting; just search for Tay tweets on any image search. Do explain how it decided to tweet that stuff without continual learning.

    • @AGI.Lambdaa
      @AGI.Lambdaa  6 days ago

      @wwkk4964 I am talking about continuous learning in the academic sense. Why don't you study first?

  • @callmetony1319
    @callmetony1319 3 days ago +18

    "We want LLMs to learn from the data, but they are actually learning the data."
    Brilliant, and the rest of your analysis is spot on. I'm by no means an expert, but I can't help feeling your argument reflects my own feelings and inklings about the current state of AI and LLMs. I increasingly feel that in order to get to AGI and then superintelligence, we need AI systems that decide how to internally arrange the world, not just relationships between tokens, and not through fitting a curve to human-supplied data but rather by exploring their world themselves to build their own internal world model and adapt it continuously. In other words, AGI is inherently agentic, as you say. This is a view I previously didn't hold, but now I tend to think it is probably the case.

  • @lukasritzer738
    @lukasritzer738 10 days ago +33

    Anyone else think it's funny they didn't name generative AI "GAI"?

  • @SimonNgai-d3u
    @SimonNgai-d3u 5 days ago +9

    LLMs certainly are a dead end, but o1 and o3 are a kind of their own. They are no doubt superhuman in abstract thinking. I suspect they will achieve a lot more if given better vision and more. The vision o1 currently has is kind of stuck at the 4o level.
    And btw, it is not allowed to use vision for the ARC-AGI test. Some believe it's limited by perception rather than reasoning.

  • @spectator59
    @spectator59 9 days ago +12

    Good content. I've been saying similar things; it's shocking how many people refuse to see/accept what's actually happening with LLMs. (minor point: I think you could find a better TTS these days).

  • @jamesrav
    @jamesrav 12 days ago +11

    I like to get all viewpoints, critical and "hype". It sounds like a combination of pure neural nets and other techniques (opposed by Hinton) may move things forward. But your comparison of LLMs to a compressed database is persuasive. Hard to see how people think the current methods for LLMs will lead to novel discoveries. (Good video, though you should correct some spelling errors.)

    • @Formalec
      @Formalec 2 days ago

      But it is also simplistic in that direction. We know that NNs do not just do "simple" database lookups of factoids or even of functions (entire "passages"). Answering combined questions needs the LLM to retrieve, combine, and chain learned "textual patterns/behaviours" in more or less complicated ways.
      We know that if gradient descent finds it useful, given how the training is set up, complicated algorithms can and will be implemented by networks.
      I do agree with the view that, clearly, the train-on-everything-on-the-internet approach did not result in one consistent self-identity, world model, or anything else related to grounding. Clearly training must be much cleverer and better supported in order to get robustness.

    • @ihkali122
      @ihkali122 2 days ago

      @@Formalec I think he also mentioned that LLMs don't just compress data; in the process of compression they learn some patterns, and those are useful if you ask similar kinds of questions. But they do not understand the language of the physical world, so expecting them to become AGI is not correct at all, unless you have a model of the world and some interaction to learn the language, with continuous learning and more.

  • @frq3155
    @frq3155 1 day ago +1

    That is exactly what I've been saying for a year, and I don't even work in AI. If you just stop for one moment and think about how LLMs work, you will immediately realize that they were not, are not, and will not be a path to AGI, should we ever create it. Nobody listened to me; people in various comment sections were eager to let me know that I'm wrong, and not only goofballs on the internet shared this opinion, but also some really smart guys I know. It's about time that at least someone realises the current AI market is a bubble.

    • @InfiniteQuest86
      @InfiniteQuest86 1 day ago +1

      You and me both. This is such a difficult uphill battle because the AI labs have so much marketing hype and gimmickry. They can't afford for people to understand what is really going on.

  • @Luke-SpyWalker
    @Luke-SpyWalker 4 days ago +6

    AI has hit a wall indeed, but not because it has stopped; rather, it's going vertical, like climbing a wall.
    (Watch the Wes Roth video that is recent to this comment.)

  • @aloisduston5466
    @aloisduston5466 5 days ago +3

    Can you make a concrete prediction about a benchmark that LLM-based systems will never crack?

    • @w花b
      @w花b 3 days ago

      A mathematical argument would be the most convincing.

    • @amotriuc
      @amotriuc 20 hours ago +1

      The problem is that there is no benchmark that can test for understanding. Extreme example: I have a database of answers for the benchmark. It will crush the benchmark, but it has zero understanding. I think a good tell that LLMs do not build understanding is that they require a ginormous amount of data to learn; if they built understanding, they would need far less data.
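
      A minimal sketch of the database-of-answers point, with made-up toy questions (nothing here comes from the video): a pure lookup table aces every item it has memorized and fails everything else, which is why a benchmark score alone cannot certify understanding.

```python
# Toy illustration: a "model" that is literally a lookup table of benchmark answers.
memorized = {
    "2 + 2": "4",
    "capital of France": "Paris",
    "boiling point of water (C)": "100",
}

def lookup_model(question: str) -> str:
    # Perfect on what it has stored, clueless on anything else.
    return memorized.get(question, "no idea")

def accuracy(benchmark):
    return sum(lookup_model(q) == a for q, a in benchmark) / len(benchmark)

seen_benchmark = list(memorized.items())
novel_benchmark = [("3 + 5", "8"), ("capital of Japan", "Tokyo")]

print("seen items: ", accuracy(seen_benchmark))    # 1.0 -- "crushes" the benchmark
print("novel items:", accuracy(novel_benchmark))   # 0.0 -- zero understanding
```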

    • @aloisduston5466
      @aloisduston5466 19 hours ago

      If you make claims about the inherent limitations of LLM-based systems, but you can't provide a concrete benchmark that reveals this limitation (even a secret benchmark preventing contamination of train/test data), then the limitations strike me as more philosophical than scientific.

    • @amotriuc
      @amotriuc 18 hours ago

      @@aloisduston5466 Just because I can't, or because today no one can, does not mean it is a philosophical one... I doubt it is feasible to have secret benchmarks.

    • @aloisduston5466
      @aloisduston5466 18 hours ago

      @ Science is the domain of statements about which you can make experiments and take measurements. If you make a statement that isn't experimentally testable, it's debatable whether it falls under the purview of "science".

  • @dmitryalexandersamoilov
      @dmitryalexandersamoilov 6 days ago +1

      I have an idea: how about we use LLMs in conjunction with another strategy that solves that problem? 🤔

    • @AGI.Lambdaa
      @AGI.Lambdaa  6 days ago +1

      Yann LeCun also proposed a solution (JEPA); you can search for it.

    • @AGI.Lambdaa
      @AGI.Lambdaa  6 days ago +1

      @@dmitryalexandersamoilov 12 steps to AGI by Richard Sutton

  • @davidrichards1302
    @davidrichards1302 2 days ago

    LLMs are actually the key to ASI. People just don't understand how they will function in an ASI system. Some careful thought will reveal what that crucial function is. It's there for everyone to see, if they are smart enough and understand the problem properly. Stay tuned...

  • @bestmoment151
    @bestmoment151 1 month ago

    Amazing video. What software did you use for this?

    • @trainjumper
      @trainjumper 7 days ago +1

      Manim

  • @kevinscales
    @kevinscales 1 day ago

    Yes, LLMs are just data compression, but the right kind of data compression is intelligence. I'm sure the architecture is not the best possible, but it's a great start.
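
    For readers who want the compression framing made concrete, here is a rough sketch (the token probabilities are invented purely for illustration) of the standard link between next-token prediction and code length: a model that assigns probability p to each observed token can, in principle, encode the text in about -sum(log2 p) bits, so lower prediction loss literally means better compression.

```python
import math

# Hypothetical probabilities a model assigned to the tokens that actually occurred.
observed_token_probs = [0.9, 0.4, 0.05, 0.7, 0.2]

# Shannon code length: roughly how many bits an arithmetic coder driven by this
# model would need for these five tokens.
model_bits = -sum(math.log2(p) for p in observed_token_probs)

# Baseline with no model at all: a uniform guess over a 50,000-token vocabulary.
uniform_bits = len(observed_token_probs) * math.log2(50_000)

print(f"with the model:  ~{model_bits:.1f} bits")
print(f"without a model: ~{uniform_bits:.1f} bits")
```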

  • @InfiniteQuest86
    @InfiniteQuest86 1 day ago

    Thank you. This should be required watching for everyone on the planet. So many people argue so strongly that LLMs understand and can think like humans.

  • @Bencurlis
    @Bencurlis 3 days ago +1

    I agree with most points, especially that continual learning and goal oriented problem-solving are crucial. However, I do not think that vanilla LLMs possess no understanding, which I think you should call "grounding" based on your examples by the way. LLMs create their own latent variables to explain the text, then it is not too far fetched to suppose that the best way to predict the text is to recover a good model of the physical world, because it would be the best explanation of the text (which was produced in the real physical world). So, it is likely that LLMs recover such real world models, at least partially, at least for some concepts. Of course, unless we make progress in interpretability this is sadly impossible to verify in practice. This world model is what we intuitively mean by understanding.
    To conclude, I think it is safer to say that LLMs probably understand some things and not others, maybe even depending on the context (prompt), but we do not know for sure when (and if) they are understanding or not.

    • @ihkali122
      @ihkali122 2 days ago

      Nice analysis, but I must disagree with some of the points you made.
      First, if we develop an AI that interacts with words, has a model of the world, and possesses text generation capabilities, I don't think it can still be classified as an LLM; you can call it baby AGI lol. Therefore, I agree with the point that relying solely on LLMs as a pathway to AGI might be a dead end. However, I respect your opinion on the matter. Second, when it comes to "understanding", if you define it as the ability to reflect the data learned during training, then yes, LLMs demonstrate some level of "understanding". For instance, they can use the word "time" accurately based on their training data. However, this is not the same as understanding the concept of time in the physical world, so we can say that they do not understand language.
      The Chinese Room thought experiment is a great way to illustrate this.

    • @Bencurlis
      @Bencurlis 2 days ago

      @@ihkali122 > If we develop an AI that interacts with words, has a model of the world, and possesses text generation capabilities, I don't think it can still be classified as an LLM; you can call it baby AGI lol
      Personally, I would not call it AGI just because it has a world model. To me an AGI needs many more features, such as continual learning, unsupervised learning, etc.
      As for understanding, I think it can have two meanings. The first is "grounding": the sense that the LLM knows what it is talking about, that the word "up" corresponds to a direction in 3D physical space. That is the sort of understanding a world model gives you.
      In a second sense, understanding is simply equivalent to "generalizing": you understand multiplication if you are able to multiply together any two numbers, even if you never multiplied these two specific numbers before. This second sense of understanding is in some sense easier to satisfy for LLMs, because they only need to generalize, and clearly they do, at least on some problems.

    • @mlresearch.21
      @mlresearch.21 2 days ago

      @Bencurlis Well, the first AGI part was fun, so I will ignore it. Regarding the second part, the point of the video was that they do not understand language: obviously, if they do not understand the actual meaning of the words, they do not understand the language. As far as problem-solving is concerned, it is best explained by the Chinese Room thought experiment mentioned above. Also, n-gram models can generate new paragraphs that are sensible and not present in the dataset, but you cannot say that n-gram models understand language at all. Approximating n-grams with next-word prediction plus compression does not make a model intelligent until it interacts with the real world. I am not saying that LLMs are useless or anything like that; they are really powerful, with the potential to automate a lot of things, but they do not understand language at all.
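
      A tiny illustration of the n-gram point above (the corpus is a made-up toy, not anything from the video): even a bigram model built from raw counts will happily produce "new" sentences that never appear in its training text, with no understanding involved anywhere.

```python
import random
from collections import defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog .").split()

# Count which words follow which: bigrams[w] is the list of observed successors of w.
bigrams = defaultdict(list)
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[w1].append(w2)

# Generate by repeatedly sampling a successor of the current word.
random.seed(0)
word, output = "the", ["the"]
for _ in range(12):
    word = random.choice(bigrams[word])
    output.append(word)

print(" ".join(output))  # typically a "new" word sequence not present verbatim in the corpus
```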

    • @Bencurlis
      @Bencurlis 2 days ago

      @@mlresearch.21 The Chinese room argument never was a good argument for or against AI understanding things, firstly because the argument applies just as well to a regular human like you and me (individual biological neurons do a simple mechanical job without any understanding by themselves), and secondly because it has loads of other issues I won't go into here unless you absolutely want to, but they are easy to find on the web.
      Transformers are not n-grams, so the fact that n-grams have no understanding tells us nothing about transformers. Notably, n-grams do not learn latent variables.
      Transformer-based LLMs clearly have some generalization power, so they are at least understanding in the second sense I proposed. The only remaining uncertainty is whether these models are getting grounded by the sole fact that they predict text produced by physical systems (humans) living in a physical world. If they are grounded or partially grounded, then they have some understanding of the world according to the first sense of understanding I proposed, but my argument is that we do not have the tools to verify this.

  • @peterchang3998
    @peterchang3998 4 days ago +1

    A neural network performs neural functions, not real brain function and intelligence.

  • @fgvcosmic6752
    @fgvcosmic6752 21 hours ago

    I'm going to make this comment _before_ watching, and edit it afterwards to see how my opinion changes.
    I personally don't think LLMs are a dead end, but I don't think they'll ever be AGI alone. I think they're exactly what it says on the tin: language models. I personally believe they'll be a _part_ of AGI, possibly in a modular fashion. An LLM can't perform true logic, but it _can_ translate between vector space and human language.
    EDIT: A lot of the video reinforced my view. I'm not entirely certain reinforcement learning is the main way forward, especially because it is inherently difficult to supervise and maintain. It would surely be a better contender for AGI, but it would require a full human experience, which is itself an ethical issue, or incredibly thorough simulation, which is currently unachievable.

  • @marksverdhei
    @marksverdhei 4 days ago +3

    Your video is titled "Why LLMs Are Going to a Dead End Explained".
    I do not think you have grounds to know that LLMs are going to a dead end.
    I agree with you that RL and online learning schemes are necessary moving forward, but I think they will be most effective when applied to LLMs, especially when increasing the dimensions of modality.
    It could be a problem that the model learns data that is incorrect, but in supervised learning we define the environment. This is comparable to putting an RL agent in an environment where most of the observations it receives are such false statements, in which case the RL agent would learn the same "wrong" facts.
    Unsupervised pre-training is not just about learning N facts about the world. A very important reason for doing it is to get general language understanding (morphology, syntax, semantics and pragmatics) and language problem-solving ability. That being said, frontier models are pre-trained on most of the data available on the internet (trillions of tokens), and although there are many false statements, the natural frequency of a claim on the internet is likely positively correlated with whether or not it is true.
    Most people believe grass is green, water is wet, gravity exists, etc.
    I'd say constraining LLMs to offline training can lead us to a dead end, but I think that is quite a different claim from the one in the title of this video.

    • @mlresearch.21
      @mlresearch.21 4 days ago

      I want to clarify a few points regarding the title of the video. It claims that LLMs are a dead end, implying they are not the path to AGI.
      The first point concerns how LLMs learn in the first place. When LLMs learn historical facts, they often fail to reason based on an understanding of time. For instance, they may accept contradictory statements about historical events that defy chronological order. This indicates that LLMs do not reason with an actual understanding of time, particularly when the data is incorrect, which is often unavoidable on the internet. The same applies to other words and concepts. While LLMs can derive a representation of the word "time" from their embeddings, they do not possess a genuine understanding of time with which to reason effectively. Similarly, simply updating more weights to reinforce universally accepted facts like "grass is green", "water is wet", or "gravity exists" does not make LLMs intelligent.
      It is well documented that biases in LLMs arise from the datasets they are trained on. These biases further highlight the gap between merely learning from data and truly understanding language, the world, and its complexities. For example, if the data predominantly supports Newtonian physics, it is unreasonable to expect LLMs to produce theories aligned with relativity, which contradicts Newtonian physics.
      How can we teach LLMs to understand language about the physical world, which they have never directly experienced? Yann LeCun has discussed this problem and even proposed potential solutions, such as systems with reasoning capabilities like JEPA. However, these systems go beyond LLMs; they aim for a more sophisticated architecture that incorporates reasoning.
      While an LLM can be trained on data about the theory of relativity and provide accurate answers, this does not mean it truly understands the concept. This lack of understanding limits an LLM's ability to reason critically or contribute to groundbreaking discoveries. Relying solely on non-linear mapping over a large dataset cannot be expected to yield transformative results.
      *I thought, with your first statement, you would provide substantial evidence to support your claim. I expected you to share proof of continuous learning, critical thinking, or that LLMs understand the physical world. If they cannot do these, then obviously they are not going to achieve AGI as hyped by people like you.*
      th-cam.com/video/mnGUfkMt9fE/w-d-xo.htmlsi=B0f_MeKVXoKeNMRj

    • @D3MONFIEND
      @D3MONFIEND 4 days ago

      If the frequency of information correlated with truth, then LLMs would not fail so hard. Why does Google's Gemini tell people wrong information sourced from that one Reddit post? Why is ChatGPT not reliable enough for deep questions? You still have to prompt multiple times in order to get the right answer. If an LLM were smart enough, it would tell the user that it doesn't know the answer, or give an alternative, instead of hallucinating garbage answers.

    • @D3MONFIEND
      @D3MONFIEND 4 days ago

      Garbage in, Garbage out.

  • @spandanganguli6903
    @spandanganguli6903 5 days ago +1

    Bruh I didn't expect it to literally be linear regression.

    • @mlresearch.21
      @mlresearch.21 5 days ago

      There are better ways to show that you know about non-linearity and transformers than this, bro.

    • @spandanganguli6903
      @spandanganguli6903 4 days ago

      @mlresearch.21 I don't know non-linearity that well. I learned regression to use in econometric models, and all of the "non-linear" models I know and use are just transformations of linear models.

    • @toma3025
      @toma3025 2 days ago

      It's not; this video is a rather ridiculous oversimplification of the topic.
      Linear regression doesn't allow you to encode many-to-many, non-linear relationships in the way that neural networks can. You could never get a chatbot to work using simple linear regression, for example, as speech isn't a linear process.
      His generalisation from the toy example of a model trained to add two numbers together is extremely naive and misleading: it's a bit like saying that modern computers aren't really any better than the abacus because you can perform arithmetic on both of them.
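
      A minimal sketch of the non-linearity point (scikit-learn, with illustrative hyperparameters): ordinary linear regression cannot represent XOR at all, while even a tiny neural network usually can.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

# XOR: the classic relationship no linear model can capture.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

linear = LinearRegression().fit(X, y)
print("linear regression:", linear.predict(X).round(2))  # [0.5 0.5 0.5 0.5]

mlp = MLPRegressor(hidden_layer_sizes=(8,), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0).fit(X, y)
print("tiny neural net:  ", mlp.predict(X).round(2))      # usually close to [0 1 1 0]
```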

  • @nrcbl
    @nrcbl 1 day ago

    This is a good video. I rarely see coherent critiques of ML nowadays. I used to do research, excited about the prospect of AGI, but I became disillusioned with deep learning. The problems I focused on were continual learning and RL, but I eventually had to admit to myself that we don't have the tools at all to build continually learning agents. I tried to come up with a powerful learning algorithm that was more amenable to continual learning but was stymied at every turn. I am certain that it exists, but I don't know where else to look except biology and the vertebrate brain.

    • @amotriuc
      @amotriuc 19 hours ago

      I have a suspicion that LLMs might damage AGI development long term. I doubt LLMs can get to AGI, but since all resources are being spent on LLMs, other alternatives might not be investigated.

  • @ecicce6749
    @ecicce6749 3 days ago

    Your premise is wrong. We can make the LLM learn a wrong fact, but with a good prompt we can convince it later that it learned a wrong fact. It can even include that in its reasoning.

    • @mlresearch.21
      @mlresearch.21 3 days ago

      Actually, you didn’t address the premise. Since LLMs lack the ability to learn from data and are merely learning the data, we have resorted to spoon-feeding them with carefully crafted prompts about everything on the internet. Claiming this to be AGI is naive. That’s why the video’s title argues that LLMs, without such advanced capabilities, are heading toward a dead end rather than progressing toward AGI.

  • @nosult3220
    @nosult3220 3 days ago

    lol yeah right… LLMs are the spark for AGI. Byte latent tokenization is the future

  • @Khal_dude
    @Khal_dude 8 days ago

    Great video! What are your thoughts on O3? Could you explain how it reasons through novel problems?

  • @pacman1187
    @pacman1187 4 days ago +1

    Great content. Indeed, the intelligence of an AI is artificial, as the name suggests, because it is all based on the trained parameters in its network and memory, thanks to the brilliant transformer architecture and associative memory. Somehow it can nicely mimic human intelligence, which is astonishing.

  • @wwkk4964
    @wwkk4964 7 days ago +1

    So what is the list of things LLMs will not be able to do better than humans? I would like a list of 20 things. They should be trivial to test, and you should be able to trivially define the tests based on your thesis.

  • @nakunam1256
    @nakunam1256 7 days ago +2

    Man, you're a legend! Keep posting great content like this. Thanks!

  • @matteoianni9372
    @matteoianni9372 3 days ago

    🤣 A couple of weeks before the o3 announcement.

    • @mlresearch.21
      @mlresearch.21 3 days ago

      People be like, 'I don’t know much about this, but I’ll confidently act like I do for dramatic effect.'

  • @NebulaNomad1337
    @NebulaNomad1337 28 days ago +1

    Well done!!!!!

  • @lukasbeckers2680
    @lukasbeckers2680 3 days ago +1

    Funny, but the first minute is already wrong. It works with humans too; it is called propaganda.

    • @mlresearch.21
      @mlresearch.21 3 days ago

      Idiot spotted!

    • @ToonTales0306
      @ToonTales0306 3 days ago

      Such a baseless comment.

  • @ronaldmtenga8093
    @ronaldmtenga8093 3 days ago +5

    If you teach a toddler that Barack Obama died in 1864, the toddler will learn that Barack Obama died in 1864.

    • @ToonTales0306
      @ToonTales0306 3 days ago +3

      Unlike LLMs, kids have the capacity for critical reasoning, and as soon as they develop an understanding of history, they begin to reason and distinguish between fact and fiction. That is not the case with LLMs. If you know how LLMs learn, you can easily get what I just said.

    • @fgvcosmic6752
      @fgvcosmic6752 21 hours ago

      @@ToonTales0306 Interestingly, the way certain stroke patients behave can be likened to this. I believe it's frontal lobe injury: the brain will accept new information and make up any line of reasoning to accept it as valid, even if the reasoning is not sound. It would sooner accept that Obama could reincarnate than reject a stated fact. This is, to me, very reminiscent of how LLMs behave.

  • @MyGreenpotato
    @MyGreenpotato 3 days ago

    Excellent video!

  • @dfas1497tcf3
    @dfas1497tcf3 1 month ago

    We should give LLMs understanding, critical thinking, emotions, reasoning, insight, and even faith. But how do we do that?

    • @Square-Red
      @Square-Red 1 month ago +4

      The base idea of LLMs, being next-word predictors, is what makes these models unable to develop any understanding or critical thinking. I guess AGI is based on agent models that interact with their environments.

    • @zandrrlife
      @zandrrlife 10 days ago

      @@Square-Red?? That's wholly wrong. There is research showing that LMs are better at next-word prediction than humans, and next-word prediction does learn reasoning if it's encoded in the data. Procedural knowledge drives these models. Are there pitfalls to next-token prediction? Yes, which is why we need to generate multiple tokens. The issue isn't the LM, bro, it's the data.

    • @Square-Red
      @Square-Red 10 days ago

      @zandrrlife That's it. Data. Biased data. Memorisation of data. That is not reasoning, nor understanding, nor critical thinking. We don't speak by predicting the next word. We have ideas, and then we try to communicate those ideas, which results in the evolution of language. LMs have never even interacted with the environment whose language they are using. They might predict the word "up" but don't even know the physical meaning of up.

    • @Zopeee
      @Zopeee 5 days ago +2

      @@zandrrlife Look at the o3 results and see that it still isn't able to solve some child-level questions, despite probably (as in, I would bet all my money on it) being trained on questions related to the tests. This is clearly an indication that it is still not able to reason at all and rather mimics it. Just because it has stupid amounts of data doesn't mean it will ever be able to reason; maybe it might, but this is imo quite the hint (plus it being a sidestep rather than an improvement like before) that LLMs won't be able to reach AGI anytime soon, and if they do, it most definitely won't be with current methods, or in a way that we understand.
      More and better data will help towards improving it, but it likely won't really take us even a bit closer towards AGI. You should start looking at it as a tool that will probably help to actually create AGI at some point, rather than as something that will get there itself.

    • @fgvcosmic6752
      @fgvcosmic6752 21 hours ago

      @@zandrrlife The fallacy here is the statement "next-word prediction does learn reasoning".
      It does not; it _predicts_ reasoning. This can easily be shown by asking any language model a hard mathematical question. It will come up with perfectly logical-sounding steps that don't relate to the question at all, because that's what these models are able to do. They have logical steps in the "database", as the video puts it, but they don't have any real logic, so they don't understand why or when each step should be used, beyond what statistically tends to follow.
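
      For reference in this sub-thread, the "next-word prediction" being debated boils down to a per-token cross-entropy objective like the toy sketch below (random tensors, illustrative shapes only): the loss rewards putting probability on the observed next token and says nothing about whether any intermediate reasoning step is valid.

```python
import torch
import torch.nn.functional as F

vocab_size, seq_len = 50, 6

# Stand-ins for a model's outputs and the ground-truth next tokens.
logits = torch.randn(1, seq_len, vocab_size)           # scores over the vocabulary
targets = torch.randint(0, vocab_size, (1, seq_len))   # the tokens that actually came next

# The entire training signal: how much probability went to the observed next token.
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))
print(loss.item())
```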

  • @detective_h_for_hidden
    @detective_h_for_hidden 6 days ago

    Great video!

  • @dan-cj1rr
    @dan-cj1rr 8 days ago +3

    What now, with o3 and everyone saying it's AGI?

    • @zebraforceone
      @zebraforceone 2 days ago +6

      Every time someone at OpenAI farts it's AGI. LLMs are not going to give us AGI.

    • @plaidchuck
      @plaidchuck 1 day ago

      Read something other than OpenAI press releases and random talking-head YouTube channels.

    • @fgvcosmic6752
      @fgvcosmic6752 21 hours ago

      It is not. There is a test called ARC-AGI, and it got a high score. It is not the first to do so, and it won't be the last. Like any other, it is a benchmark and nothing more. It does show that o3 has better capability than older models, but nothing more.

  • @johnkost2514
    @johnkost2514 6 days ago +1

    This seems like an exotic form of curve-fitting...

  • @insolace
    @insolace 1 month ago +7

    The "thinking" and "reasoning" that we call intelligence is very similar to the way the LLM thinks and reasons. We forget that when we were young, we learned by imitating behavior that our parents trained us on. As we got older, we learned how to "think" by repeating the lessons our teachers taught us.
    The LLMs are missing long-term memory, recursive thinking, and an oracle whose knowledge the LLM can trust.

    • @Square-Red
      @Square-Red 1 month ago +2

      Nah, LLMs have no understanding of any meaning behind any word. They have no observation; they are built on the simple idea of next-word prediction scaled into complex models. We humans, on the other hand, have interacted with the world to create the meaning behind words. We have ideas, and we create words to describe those ideas or observations. LLMs have no observations nor any ideas. There is no thinking in LLMs, just inference of learned patterns from the trained vocabulary.

    • @tomblock7774
      @tomblock7774 10 days ago +10

      It's actually completely different from how we learn. We don't need to see everything thousands of times to understand how to repeat it. We also have a world model from a very young age, one that many, many animals (if not all) have as well, and it doesn't involve language. A great example is something like Sora: any mammal has a far better understanding of physics than Sora does. Our brains are not like neural networks, and it's really a shame those networks are named as such, because it makes people believe they are similar when they are not.

  • @sadeepweerasinghe
    @sadeepweerasinghe 3 days ago

    cupiri

  • @John4343sh
    @John4343sh 2 days ago

    Well this video aged like milk.....

  • @gjm92341
    @gjm92341 1 day ago

    Your point is wrong in two ways. First, you are talking about very basic LLMs and hiding the fact that they have emergent behaviors (as our brains do). Second, you are overestimating human reasoning capabilities (talk about "truth" with all kinds of fanatics and you will see that your concept of truth is idealistic).
    Also, the current models used as agents are already capable of replacing humans in a huge range of tasks. Your requirements for an AI are absurd.

  • @nanotech_republika
    @nanotech_republika 9 days ago +2

    Nice presentation, and some of the ideas are good. But you are wrong on a few major points in this video. I felt like you made this video to have somebody explain it to you, because you have so many questions. Your main misunderstanding is the claim that the LLM has no understanding of the information it learns. The fact is that even the smallest neural network (say, from the 1986 Hinton paper) has some understanding. Not perfect, never perfect, but it definitely has some. It is easy to show.
    Another main point you are wrong about: you mentioned static neural nets having problems updating their knowledge, and you said that this is in contrast to humans. How do you know how this is done in humans? Don't we have a huge set of "static weights" in our brain? What is the difference? You will be surprised when you find out!
    Therefore, I give you a big thumbs down for the factual content.

    • @AGI.Lambdaa
      @AGI.Lambdaa  8 days ago +2

      For the first point, you can refer to this article: LLMs and Artificial General Intelligence Part IV: Counter-Arguments, Searle's Chinese Room, and Its Implications. I hope this will provide you with some useful insights, as I am unable to provide extensive details at the moment. However, I am currently working on a second video that will explain these concepts further.
      ahmorse.medium.com/llms-and-artificial-general-intelligence-part-iv-counter-arguments-searles-chinese-room-and-its-9cc798f9b659
      To gain a better understanding, I recommend exploring how decentralized reinforcement learning (RL) agents acquire language. Comparing this process to how large language models (LLMs) operate will help highlight the fundamental differences between the two approaches.
      Regarding the second point, it's kind of funny to think we have "static weights" in our brains. I thought we had continuously learning neurons, which adapt over time. Strange.
      For now, I suggest watching this video: th-cam.com/video/zEMOX3Di2Tc/w-d-xo.html
      By the way, I have already shared arguments for why LLMs do not have language understanding, and I am expecting responses with reasoning. It seems there is a conceptual distinction between the "understanding" I am discussing in the video and the one you are referring to.

    • @wwkk4964
      @wwkk4964 7 days ago

      Excellent point. The whole thesis is a paradox, given that "learning" and "static" are at odds. Either humans and other systems are static, or we all learn. If we all learn, then it is inevitable that a model that learns language will too. It is one of the main reasons these low-effort takes on technology will always be hilariously wrong at predicting, while the (language) models continue to improve faster and faster as they learn (update) based on accuracy rather than self-BS.

    • @AGI.Lambdaa
      @AGI.Lambdaa  7 days ago

      Feels like you guys didn't get the issue in research about continual learning in neural networks. I would suggest you read some papers on the topic to get some understanding first.
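
      For readers unfamiliar with the research issue being referenced, here is a rough toy sketch of catastrophic forgetting (the tasks and hyperparameters are invented for illustration): a small network trained on task A and then trained further only on task B typically loses most of its task A accuracy, which is the core obstacle to naive "just keep training" continual learning.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Task A: label is the sign of feature 0. Task B: label is the sign of feature 1.
XA = rng.normal(size=(500, 2)); yA = (XA[:, 0] > 0).astype(int)
XB = rng.normal(size=(500, 2)); yB = (XB[:, 1] > 0).astype(int)

net = MLPClassifier(hidden_layer_sizes=(16,), random_state=0)

for _ in range(300):                       # train on task A only
    net.partial_fit(XA, yA, classes=[0, 1])
print("task A accuracy after training on A:", round(net.score(XA, yA), 2))

for _ in range(300):                       # then keep training on task B only
    net.partial_fit(XB, yB)
print("task A accuracy after training on B:", round(net.score(XA, yA), 2))  # typically drops toward chance
```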

    • @wwkk4964
      @wwkk4964 7 days ago

      @@AGI.Lambdaa Read about Tay, the first "continually" learning language model, released in 2016 and discontinued within days by Microsoft because it was "continually learning" a bit too quickly for people's comfort. It used a regular recurrent neural network. The notion that these models don't "continuously" learn is ironically proof that those people think they are still stuck in a time capsule while the models are learning faster and faster. Between this comment and the video being posted, o3 has already learned more than all of us combined.

  • @stefano94103
    @stefano94103 4 days ago +1

    I guess you hopped on the "AI has hit a wall" train; it's only been a month and this is already aging like milk. There are so many counterpoints to your conclusion that this should almost be labeled propaganda. But the details were informative and educational for those who do not understand how LLMs work, although the conclusions were kind of a miss.

    • @jacobwilson8275
      @jacobwilson8275 3 days ago

      How has it aged poorly?

    • @mlresearch.21
      @mlresearch.21 3 days ago +2

      @stefano94103 People be like, 'I don’t know much about this, but I’ll confidently act like I do for dramatic effect.'

    • @stefano94103
      @stefano94103 2 days ago

      @@jacobwilson8275 o3 is coming out in a few weeks and it's much better than the o1 models; DeepSeek V3 came out and it's one of the best open-source models ever. It's even comparable to Claude 3.5. It's been less than 6 months and these models keep topping each other. The theory that they hit a wall hasn't shown up in reality yet.

    • @plaidchuck
      @plaidchuck 1 day ago

      @@stefano94103 lol, based on what evidence, OpenAI's press releases? Start reading actual white papers and research.

  • @daburritoda2255
    @daburritoda2255 10 days ago +2

    o3 is AGI.

    • @jacobwilson8275
      @jacobwilson8275 3 days ago +1

      Doubt it. How do you know? My understanding is that we don't have enough resources to create AGI.

    • @daburritoda2255
      @daburritoda2255 3 days ago

      @@jacobwilson8275 If you define AGI as the ability to do most economically valuable tasks better than most humans, then I think o3 has a decent chance of succeeding at that in non-embodied tasks.
      Embodied AGI is different, in my opinion, and will require significant advances in robotics before we get there.

    • @callmetony1319
      @callmetony1319 3 days ago

      o3 is not AGI: the problems it performs poorly at, it will perform poorly at no matter how many times it encounters them. To be "general" as in "general intelligence", a system must be generally applicable and hence must be able to approach novel problems and improve at solving them. o3 is no doubt a breathtaking milestone on the road to AGI, but unless there is something OpenAI isn't telling us (and to be fair, they are likely not telling us nearly the complete picture!), o3 is not AGI.

    • @daburritoda2255
      @daburritoda2255 3 days ago

      @ Yeah, I agree it's not AGI.
      I just really dislike the content creator's complete writing-off of LLMs as a way of achieving AGI.
      On the other hand, though, you haven't used o3 yet, and as far as I'm aware, there is no good benchmark for assessing how well it performs on questions about topics outside its training dataset (or "novel" questions).

    • @callmetony1319
      @callmetony1319 2 days ago

      @@daburritoda2255 Yes, that is fair; I wouldn't write off LLMs as a route to AGI, and I haven't personally used o3. But again, unless it is somehow fundamentally different from o1, its capabilities are fixed, meaning its ability to generalise is limited. On the other hand, if the o-series models work the way we think they work (Q-learning plus A* search), then I don't see what's preventing o1 or o3 from continuously learning during inference. But maybe that algorithm (Q*) is not applicable to inference and is only appropriate for training, i.e., the Q* algorithm is an automated way of training a model for reasoning capabilities but is incompatible or inappropriate for use at inference time. I don't know either way; I'm just speculating. If an o-series model were capable of learning during inference, even at o1-level performance, then I would be happy to call it AGI.