@@perogycook This is so wrong. Back then, AI was mostly simple rule-based systems. Today, we have powerful models like deep learning and LLMs that can understand language, generate content, and solve complex problems. Calling it "fancy regression" is too simplistic. While LLMs aren't AGI, they're a big step forward and show real progress towards creating more general AI. The path isn't simple, but to say there's been no progress isn't accurate. We don't know if LLMs can get us to AGI or not; I personally think they won't. But you are wrong
@@lionelmessisburner7393 I didn't say there's no progress or that LLMs aren't useful (I'm personally incorporating a small LLM into my app as an alternative to a traditional search-based front end). I said they're not the path to AGI. And what's in the news this week? OpenAI's new model is disappointing, LLM capabilities are levelling off and domain experts like Ilya Sutskever and François Chollet are saying we need new techniques because LLMs aren't it.
@@Dan-dy8zp also AGI definitely does not already exist. If you understood how far LLMs are from AGI, you would know how far AGI is from being a reality.
yeah it's just a new word for "search" if a bot does it or benefits from it, right?? 🤔 but we get new words for everything from the perspective of bots doing it in order to deny that bots are here thinking things 🤷♀😅
Yes yes yes, but one component of human creativity is very similar to what LLMs do. When I sing a melody over a chord progression, it is generated for me based on my training data. I, the chooser, the decider as W would say, decide whether the direction of the melody my brain gave me is good enough, or if I want a re-roll, or to push the melody one way or another. But fundamentally, my brain relies on something subconscious to hear melodies that don't exist yet, derived from the melodies I've heard or the sequences I've practiced.
I guess our brain operates the same, or roughly the same, as an LLM. We practice things many times before doing them well. Without someone telling us, without practice, repetition, and reinforcement, we know very little. We have many years of training, 24 hours a day. Our main advantage over an LLM is that we receive info not only by text: we see, hear, touch, taste, etc... I don't understand why an LLM system can't learn the operations required to do basic arithmetic, if it can write computer programs and debug and correct code from error messages. I doubt ChatGPT is just spitting out the next word using probability; the answers are very sophisticated, broad, concise, etc.
Listen again… What he says: “I don’t know how other companies are getting their data, so it’s hard to talk about the way in which they differentiate”
I think they're all trying to do that (solve real world problems), and with the same strategies. But this one is selling us a feeling of safety while he does it. He doesn't know how brains do consciousness or agency, no one does yet, but he's totally confident it won't emerge spontaneously from anything he wants to do. No, not with these ever growing models that keep spontaneously displaying startling new emergent abilities as they are made larger. They are harmless and just amazingly useful.
Ceiling?? Where?? Proof? Bit early to be claiming that, no? Is it the data? Cause word on the street is that synthetic data is even more effective than real data. Power is a bottleneck, but they're building power plants as we speak.
Unfortunately he didn't say what their approach to RAG is, but he did explain how it is / has to be done if you have large datasets. I don't know if you have ever built such a complex workflow: multiple specialized models that you first test for a longer time, select, and then prompt specifically for the task, narrowing the possible answers - based, again, on the large amounts of correlated data it found - down to the correct answer to your question. Also think of the fact that in larger companies hundreds of docs get added every day, and their contents need to be extracted from multiple file formats, markup languages and text formats and then inserted and indexed into the databases. Not to speak of finding the correct citations. And first you maybe have to narrow the question down to a specific situation or text passage if it is very vague, and you also have to control the LLM's output and maybe take actions to narrow that down. Let's say you have the Bible in the database ;-) and have already had a longer conversation with the chat application about it. Then you ask "what did Jesus tell the people", and you refer just to the last two or three sentences? But the model has your whole conversation in its chat history and now searches for answers to that question based on your whole conversation. Maybe it would condense all the found answers into a short one, but maybe it would also try to spit out every text it found in the Bible relating to that question. What if the context was too small for that? What if not? Maybe it would write down lots of answers in a list, maybe thousands of words, ongoing and ongoing. So you also have to deal with and handle input and output. That's all extremely complex to get to a good application for large datasets, and maybe traditional database applications are better suited for a lot of, if not most, use cases.
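To make the retrieve-then-generate loop concrete, here is a bare-bones sketch. It is purely illustrative: embed() and generate() are hypothetical stand-ins for whatever embedding and chat models you actually plug in, and a real system adds all the chunking, indexing, citation tracking and conversation handling described above.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

def generate(prompt: str) -> str:
    # Placeholder: a real system would call an LLM here.
    return "[model answer grounded in]\n" + prompt

documents = [
    "Doc A: quarterly revenue summary ...",
    "Doc B: product spec for the new widget ...",
]
doc_vectors = [embed(d) for d in documents]

def answer(question: str, k: int = 1) -> str:
    q = embed(question)
    # Cosine similarity against every stored document, keep the top-k.
    scores = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v))) for v in doc_vectors]
    top = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)[:k]
    context = "\n".join(documents[i] for i in top)
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("What were last quarter's numbers?"))
```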
If you had actually built anything substantial with RAG, you would know that not all RAG systems are equally good. There are numerous secrets involved in a great RAG system, and in the generative model that gets augmented, too.
He doesn't think he needs a moat. If he had the same resources as OpenAI and Facebook, he'd be the most dangerous of the three. A feeling of safety is what he's selling. He's telling us what we want to hear.
Do you think that humans work on things they have not seen before? This is, I guess, also a philosophical question, but you're saying that LLMs only do what they've been trained on, and people do too. Earlier you talked about what you call hallucinations, but these are flights of fancy: a mixture of learned data novelly expressed as something that never was. The information they produce is not necessarily something they were trained on. It comes from the training data as language, but it is still novel and new, and not from the training data, because the output has no matching sequences within the training data; and if asked to examine it for truthfulness, the model would recognize it as a delusion. This is no different from complex goals, such as creating a new mathematical formula to solve a certain problem. If it is based on the statistical representation of the real world, then absolutely novel ideas can come out of it, because quite frankly that is all humans do. Nobody suddenly has an idea to do something they know nothing of. A person with no knowledge of how refrigeration works is not going to look at the compressor on a refrigerator and say, "I know how to make this better." They have to understand certain aspects. Then again, you sometimes come across somebody who has an idea about something precisely because they don't completely understand it, and this is where some interesting inventions have come from in the past: trying something that no one well trained in the field would have done, while still using the real-world knowledge they had about other things. If you think that LLMs are less capable than humans because we are doing something special and interesting with the real-world data that we have, you are incorrect
Laid off by AI, and/or human extinction? An AI new world order? With swell robotics everywhere, AI job loss is the only thing I worry about anymore. Anyone else feel the same? Should we cease AI?
AT LAST, A TECH BRO NOT FALLING INTO THE HYPE OF THE AGI DOOMSDAY SCENARIO. I WANT AI TOOLS THAT WORK WELL, NOT SOME ENTITY THAT WILL DEBATE ME ABOUT THE MEANING OF LIFE BUT IS USELESS AT EVERYTHING.
B-but a charming young man (who wants our money) says these new programs are *fine*. They can never compete with us, because they can't have agency, they just can't, and they are not at all dangerous, just incredibly useful. It must be so.
Unless you define "artificial" as our mind, which is not.. discoverable in matter. Brian Greene: Quantum Gravity, The Big Bang, Aliens, Death, and Meaning | Lex Fridman Podcast #232 - somewhere in the first 8 minutes or so: "not only is there no evidence.." LUL he's awesome.
Answer: No. He quite literally said he sees no reason why we can’t eventually develop AGI systems, just that we are absolutely not there yet and that LLMs will not take us there.
@@Nebukanezzer no, not at all. To say AGI is fantasy is about as dumb as saying Chat GPT 5 is going to be AGI and that it’s happening within a year or two
The only thing we have achieved with AGI is creating this indescribable hype in the AI community, to the point that everyone today believes that if they step out their front door tomorrow they have to fear for their life because a T-3000 will come around the corner. That sounds exaggerated. But when I have to read comments claiming that an LLM can be better than any ML algorithm, that's it for me.
@@2AoDqqLTU5v It's a powerful human intuition, I think. Even when people think they don't believe brains have some special spark, they still lean into it.
There was never something impossible that became possible, but rather something that was not studied well enough. Hilbert tried to axiomatize all mathematics. Gödel arrived and theoretically demonstrated that it was impossible. Therefore, your assumption is false.
So you are telling me AI is all hype and professionals are struggling to make good products out of it? You are also telling me you are better off prompting a general model than a fine-tuned one? That's bonkers. Why would you build a product that behaves worse the more you try to constrain it? That's like the opposite of optimization.
This is a serious guy, so refreshing to listen to someone with their head screwed on correctly
You think so because he's saying what we all want to hear and because the alternative is counter-intuitive. Everything about LLMs has been counter-intuitive, though. I wish we knew for sure what could and couldn't produce dangerous agentic AGI, but the fact is we don't seem to. We don't know how we work in that regard either.
These models are just big hunks of emergent-properties-filled mystery. How fast we get used to the properties we already know about and start to say 'that's not anything like AGI, that's just an LLM thing. Let's make a bigger one', when a few years ago we would have thought 'it does what? And that too? Isn't that *already* AGI?'. I feel it's safer if these programmers of large models err on the side of "we don't know" instead of "it can't happen".
Remember that he, too, is trying to sell us something.
It can't happen silly.. No more tv for you. @Dan-dy8zp
@@Dan-dy8zp you are delusional
@@Dan-dy8zp Emergence in LLMs is HOGWASH engineers spew because there is no mathematical rigour in how they develop these models. A category theory approach to LLMs, and to machine learning in general, could yield insight into the inner mechanics of LLMs. All models are just mappings, and mappings have precise definitions, thus there's no such thing as a BLACK BOX mathematically. Engineers aren't trained to find weird and complex associations in abstract mathematical ideas; that's the mathematician's duty. Mathematical rigour helped physics gain deeper insight, it could do the same for Machine Learning.
I don't know how they manage to keep bringing these amazing guests episode after episode
Wow.. some sanity, humility and thoughtfulness brought to the AI debate... I applaud!
Canadians are mostly pragmatic.
The most coherent and down to earth LLM discussion I've heard in a while!
no pun intended? xD
It says something about the current state of things when a company saying “we aren’t building digital gods. We are trying to solve real world problems” is a green flag.
Excellent video as always MLST. I’m 10 minutes in and I can see the channel improving with every vid. This deep dive, direct to the source, appropriately skeptical content is needed in this parrot-filled AI hype cacophony.
Thanks for the summary brotherman
Great times ahead
2033 maybe.. I was thinking more like 2049
We shall see.. NASA has some.. experiments to run !!
Water is MAGICAL
Love your content too @Mutual_Information!
How is it a green flag? They will be left in the Dust by other companies actually building AGI.
@@marwin4348 because it’s not buildable, in my opinion. Lots of good uses of AI, but AGI is like cold fusion.
"parrot filled Hype cacophony" bwahahaha.... Hype! You're still calling it all Hype!!! Ahhhahahahaha! Too cute! Parrots!! That's right! Damn parrots they are! No thinking!
That opening sentence is so refreshing to hear 👏
Great interview, but it's surreal hearing the lead singer from Good Kid talking about *his* company's ML products. A Renaissance man.
For anybody curious about the band, check out "good kid no time to explain".
Can't believe I'm watching the singer of a band whose songs I have lost myself so many times into doing such a great interview on AI. Simply amazing!
If you can, give “Dance Class” a watch too!! My cousin’s in the music video and she did amazing ❤❤❤❤
The face of Nick every time Tim tries to bring him into the AGI, intelligence, sentience, agency... BS debate... And then him lecturing Tim on how LLMs really work.
PRICELESS!
We don't know how intelligence, sentience, agency, etc. work, nor what further properties larger predictive models may have. Dismissing danger and telling everyone what they want to hear is the riskiest attitude. It doesn't matter what he's *trying* to produce.
@@Dan-dy8zpbut we do know exactly how transformers work and we know what they don’t do
@@Jm-wt1fs We know how neurons work. They take in energy, excrete waste, and produce bursts of electricity.
We still don't really understand how human consciousness and subjective experience arise from this process, or why a few people become serial killers and most don't, well enough to identify them in advance and prevent them from harming people. What we don't know about our minds is still greater than what we know.
We don't have the ability to put a baby in an MRI machine and say with great confidence, "this child will NEVER become a fascist dictator". We are even more ignorant about AI.
We must not make AI too capable before we have much more understanding than we have now about human brains.
Even if we can be *totally sure* that a transformer-based model, or a team of specialized models, can never do 'X', we don't know that we aren't one small breakthrough away from adding on something new that will harness the information in these models to make AGI possible.
@@flickwtchr like what? We can definitely explain how transformers work
@@Jm-wt1fs They're referring to the model, the function that takes the inputs and returns the output when you use it. It's approximating some kind of function, but there are so many moving parts that it's hard to have an explicit function that's humanly usable. What builds that function is known, like you said, but not the resulting function's formula, or at least it would be too complex, hence the black box term.
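To make that concrete, here's a toy sketch (purely illustrative, not any real model): even a tiny two-layer network is just a nested formula, but the written-out formula stops being humanly readable almost immediately.

```python
import numpy as np

# A tiny 2-layer "network": f(x) = W2 @ relu(W1 @ x + b1) + b2
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)   # hidden layer: 4 units, 2 inputs
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)   # output layer: 1 unit

def f(x):
    h = np.maximum(W1 @ x + b1, 0.0)   # ReLU nonlinearity
    return W2 @ h + b2

print(f(np.array([1.0, -0.5])))
# Even this toy has 17 parameters and a piecewise formula with several cases.
# A modern LLM composes dozens of such layers over billions of parameters, so
# the exact input-output formula exists in principle but is not humanly readable.
```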
I think people discussing ASI have their heads screwed on well, they’re just considering capacities that are farther down the line. It’s great to also hear a more immediate perspective. What I got is that combined systems working together and outsourcing are useful for AI. Add that to embodiment, a sensorium, and the capacity to “world-build” in real time, and I think we will have AGI.
Yeah I agree, I think vids like this garner a lot of weirdos who hate progress and then they can shit on ppl who talk about AGI. When really, AGI is going to happen, but it doesn’t mean we have to always discuss it.
this is such a high quality and immensely well timed channel. the interviews are really right on with the SOTI (state of the industry). just made that up by the way. enjoy.
Anytime I use Cohere’s grounding model, it always hallucinates the grounding references
i think grounding being difficult is not just why companies are having trouble creating grounded intelligences but also why evolution had trouble evolving them; intelligence itself is pretty easy compared to the grounding
So much going on I keep forgetting how good this channel is and forget to watch!
im really glad i found this channel, awesome work you're doing!
What a great interview. Nick Frosst seems to have a real good understanding of the current state of AI and a refreshingly open view of the landscape.
He *seems to*, but then again, that's the impression he's trying to give. He's trying to make money just like the CEO's of OpenAI and Facebook. He is selling us the feeling of safety. He says we aren't in any danger from ever larger predictive models that have weird emergent properties their creators can't predict, when we don't know how our brains do agency or consciousness.
We need that Dwarkesh guy lecturing him, in his own field, on why he is wrong and why AGI is around the corner. Again and again and again and again and again
The question I wished you had asked is this: “In what fundamental ways does your thinking about LLMs, AI and AGI differ from Hinton’s?”.
The interviewee differs in that he is saying what we want to hear and Hinton is saying what we need to hear. We have no idea where human agency arises from or what further emergent properties LLMs will develop.
Brilliant video as always, hugely underrated channel.
Fantastic interview! Agree that "multi-step tool use" makes more sense than "agent" - better reflects the current reality.
"Multi-step tool use" certainly *sounds* safer. The truth is the inventors of LLM's don't even know what abilities LLM's may develop as they grow and the interviewee certainly doesn't. Flatly dismissing the possibility you can produce something dangerous and unexpected while optimizing large mysterious predictive models for "multi-step tool use' is not less dangerous than deliberately trying to produce agency.
@@Dan-dy8zp You can idly worry or you can build. I suggest building.
@@mrjvc That is not the safer option.
@@Dan-dy8zp Come on dude. Maximum safety is maximum boring. BUILD something. Anything. Create value. Stop worrying about things like this - it's an excuse to not do what you are meant to do. DO THE THING.
@@mrjvc I recommend Robert Mile's AI Safety channel. It explores the pros and cons of this strategy.
This is the least fake-hype AI lab head I've seen, it is so refreshing! Everyone else is yelling their heads off about tech that we might have in a decade rather than focusing on what we do have. Don't get me wrong, the quest for AGI is noble and important, but holy hell, there *is* work to be done in the here and now too.
Good interview - sane discussion.
RAG is helpful - but from what I have seen there are still too many fundamentally incorrect answers from these models. There are just too many corner cases whereby an incorrect answer is disastrous for someone's career or company to put trust in the models - as a result you spend a lot of time double checking the results - time that could have been spent solving the problem directly.
These tools have great promise in the areas of dynamic documentation, enhanced search etc. But the sooner we give up on trying to get a language model to "solve" problems, the better we will realise that these models are NOT thinking - they are searching on steroids.
It's a really elaborate search function
Do you use the semantics of your own design parameters as input to any of your models?
How does this channel not have all the subs, so good! Thank you
This conversation is such a treat, Tim! 😍
Layman here, is that retrieval augmentation discussed in the beginning kind of the same thing OpenAI does with custom GPTs?
Yep, same thing.
Lol no... @@zachmccormick5116
Yes, but this guy can do it while telling buyers what they want to hear. Meddling with this new kind of software that has so far shown startling and amazing emergent properties (to get rich) is *perfectly* safe.
Never mind that we don't yet know how our brains produce consciousness and agency or much about how abilities emerge in these ever growing predictive models.
Today we are in TORONTO. We're in Cohere's office in TORONTO. We've been invited to their building here in TORONTO. They're running these in four cities, so in TORONTO, in San Francisco, in New York and London. And we're going to be capturing the events here in TORONTO.
Is that one of the Good Kid members? this is a crossover episode
The bling is distracting..... seems like a nice guy
Interesting points on benchmarks
Also when discussing RAG it is insane to not talk about it in terms of implicit and explicit data....
Are they just doing multidimensional vector computation on a fine-tuned transformer, which can already be done with a little bit of software development?
Also, for a company focused on RAG, it felt a little paper thin. There are so many cool things we can do; nothing he mentioned seemed revolutionary, and we can get the same results with normal models plus some software development.
We did not really touch on the evolution of RAG architectures and what the future might hold.
The fundamental problem with long context like Google's is that it's not as performant or as accurate or as efficient as RAG..... but RAG is usually only implicitly, but we can ground it explicitly without locking the values of implicit flexibility.
The future of LLMs is mobile.
Data indemnification seems to be the most valuable thing offered. And to be frank it's super valuable
Yes, great interview, but I also was a bit disappointed not to hear a bit about how Cohere does/improves their RAG usage in comparison to the named workflow. - But OK, maybe it's their USP.
Could you please explain what you mean when you say "RAG is usually only implicitly, but we can ground it explicitly without locking the values of implicit flexibility"? - I'm not that trained in that vocabulary.
When you say "The future of LLMs is mobile" I guess you mean using models that run on phones.
I am really excited to see how that will be solved, because what we would expect is the quality of GPT-4o with all its capabilities, and I see no way to get anything even close to that into models on a phone, so that you could use, for example, the real-time translation feature on a plane with no internet connection.
It’s a tiny necklace how is it distracting 😂
@@beerkegaard have you seen the rings?
Can you find hidden parameters, perhaps as yet unnamed in a metalanguage, to attempt to give weightings in concept analysis?
Great interview. I wonder what he thinks about O1 reasoning models from OpenAI.
It's amazing how many times they say "but you and I are doing something completely different" and get an affirmative nod, and then not follow up on what actually is different. It's vitalism all over again.
I assume you don’t know the technical details of how transformers work. If you believe this is how the brain works you are hopeless.
@@deadeaded We don't know how consciousness and agency work in humans either.
@@deadeaded But that qualitative difference may be mostly (or even entirely) the result of quantitative differences that will soon change, differences like the size of the network, the duration of training, or the number of modalities the model is trained on, etc.
Over and over, smart programmers with experience with artificial neural networks say things like "It'll never do X." And soon after, the program is doing just that.
It seems like human intuition is persistently erring in the same direction. That should make us cautious.
@@deadeaded With the exception of neuroscience, none of those actually study what the brain does, they are too many abstraction layers away. And while neuroscience can tell us that the brain is probably not using transformers or even gradient descent, it's not yet able to tell us how the brain works in general. The parts that we do understand reasonably well (like the visual cortex) show remarkable similarities to artificial NNs.
@@deadeaded Actually you absolutely can understand how a computer works by studying circuit boards (more specifically, how CPUs are made). This is arguably the only way to really understand it on a deep level, and will greatly enhance your understanding of why higher abstraction layers (like the applications) work the way they do, since they're always limited by the hardware they're running on. On the other hand, studying how Excel or Photoshop or Chrome work will give you very little insight into the inner workings of the computer.
As a counter perspective: great company, first off. However, he keeps saying the limitation is the data they are trained on, and thus they will never be agents. He's missing the underlying intuition. Using the “platonic representation hypothesis” as grounding, the data they are trained on isn't a limitation, but rather a benefit. We will get to a point where LMs have hierarchical representations. This is why I'm so excited about all the research on grokking, because I believe the answer lies with fully grokked transformers and way better data (actually providing synthetic insights about the data, not 100% synthetic data; raw data is so oblique). Assuming you're optimizing for ultra-long sequences, and you afford the model the ability to exploit this behavior with extended test-time compute… why couldn't an agent be superior to human CEOs? Hierarchical representations, imo, would result in a model with very nuanced “intuition”. Give the model the special embeddings to let it control multi-turn output itself… I don't see why we can't have legit agents. This is why I believe data engineering will be a true competitive edge and a TRUE moat. Why my intuition is so strong in this regard: the fact that we train on raw data and do post-training preference tuning that results in current capabilities makes the answer clear, the data is too oblique; hell, even how we sample during pretraining is subpar imo. I think a lot of hallucinations are knowledge conflicts. This is why I relate LMs to something like adolescent autism. The first labs that start adding behavioral psychologists to their data teams win 😂.
Idk about AI gods, I'm a Christian 😂, but I do know we are nowhere near a ceiling with LMs, and I don't mean the caveman style of pure scale increase.
I think you are right, we are not near done maxing out the potential of LLMs and similar predict-the-next-token programs. They are only a few years old and got big even more recently, yet some who should know better insist they will never do this or that. Too soon to say!
@@Dan-dy8zp How do you know that you don't even know where the limit is? To assert that we are not even close to the theoretical limit, we must know what the limit is. There are 2 types of people who say this nonsense. Those who have no idea about computing and those who want to sell you that their idea is the best.
@@raul36 "How do you know that you don't even know where the limit is?"
I'm guessing you mean how do I know that *no one* knows?
I know that many people with decades of expertise in the field have, in the last couple of years, made confident predictions about when LLMs and similar predictive models will hit a wall, or about what things these programs will 'never' do, even in "a thousand years", only to be proven wrong within months or weeks.
Very happy to see such a pragmatic, focused and grounded view of Large Language Models - both that it is represented in the people building such systems today, and that it gets surfaced here on MLST. The philosophical discussions that we often have are super interesting and worthwhile, but we must not entirely forget where we actually stand with today's technology. And most importantly, not fool ourselves into thinking that we are much further along than we really are. Or be fooled by organizations pretending that they are, in order to push their own agenda (possibly to our detriment as citizens...).
you're seeing people as thinking we're further than we are, when they're actually just trying to skate towards where the puck is GOING... they're not non-pragmatic people, they just disagree w/ you dramatically about the speed of the change
How interesting the interview is! Thanks!
What's the watch he is wearing?
The funny thing is that I'm not even here for the AGI stuff, I just know Nick Frosst as the lead singer of Good Kid
Naysayers and normalcy bias aside, AGI-ASI is a distinct possibility. The orders of magnitude spoken of by OpenAI researchers speak for themselves. When the next leap in foundational models drops (GPT-5), a whole lot of people are going to be looking VERY stupid. Altman doesn't have the backing of Nvidia, Microsoft and the NSA for no reason. They probably have things back there that are mind blowing. Not to mention one of the brightest minds in the game is now building a superintelligence company. Those are the folks I have my eye on.
Asking all the right questions, thank you
"so we're not an AGI company"
*vineboom*
Some CEOs have irrational fear to sell; this one is in the business of selling feel-good ideas. These guys are just two sides of the same coin
The Build Day was a lot of fun!
I really enjoyed this down to earth interview.
Refreshing take. Awesome sweater!
I'm really kind of miffed about this agent discussion. Agent doesn't imply agency. Agent is a COMPUTER term. It's any software that carries out operations on behalf of the computer or another program. Every piece of client software is an agent.
Agency has a much deeper meaning in cognitive science and philosophy i.e. selfhood, self-causation, intentionality and autonomy (plato.stanford.edu/entries/agency/) - when we think about AI systems we need to go far beyond the "anything goes" computer term. We have done a bunch of episodes on this topic recently
I wish we could get a podcast where the AI guys with opposing views can really go at it. There's a couple, but we need more
The man is treating us all as if we're idiots: to make LLMs more useful in business you need to make them more general so that they can solve more problems. Making them more general is the same thing as trying to create Artificial *General* Intelligence. If you're not trying to make them more general/intelligent, then what are you doing with all that investment capital?
Businesses don't hire generalists, they hire specialists.
If you're trying to make them bigger, it's because that's definitely not the path to artificial general intelligence.
Isn't this Good Kid's lead singer, or does he just look like an exact clone?
Edit: WOW IT IS!
Some people are building AGI and they are not naive. Scaling LLMs might do something or not. Or they might just be powerful tools for building something else.
Shareholders like wild claims (it's how you get funding). Similar to Elon promising Full Self-Driving for years and it never happening. LLMs probably can't become AGI, but with RAG and tool use, they could become useful. True AGI needs a lower data-to-inference ratio: humans can do a lot more with less data, but LLMs need all the data available to do less. This means LLMs are not the way to AGI. An evolutionary system needs to reward a lower data-to-inference ratio. Anyone desperate to add data is not understanding AGI, or they're lying.
Maybe we are clouding things with intelligence and just want it to be generally useful. Artificial Generally Useful (AGU (c)) ? :D
NO ONE is building AGI today, because NO ONE knows what it is. Do you think they built the atomic bomb without knowing where everything went, in an extremely careful way? Obviously not. This is exactly the same. Otherwise, the problem would have been solved long ago.
@@mikezooper This is EXACTLY what I have been saying for a long time. If it hasn't happened already, there's no way it will. If that is the path, it will be so obvious that you shouldn't be wondering what the next step is.
@@raul36 If it hasn't happened in the few years since people started scaling up transformer-based models, then it can never happen? Pretty early to make such statements.
Your second sentence is honestly funny if you've ever worked as a researcher in any capacity.
I second Nick, I really enjoyed the conversation.
Cohere downplays AGI because that isn't their business model. It doesn't mean they have any idea how far away or how close AGI actually is.
his arguments on LLM =/= AGI make sense though
Isn't it odd how the researchers who know the most about LLMs, specifically transformers, are the ones that consistently downplay them, while those that stand to gain something from it tend to hype it up?
As it is right now it's just a best-fit multivariable function finder. That doesn't mean it can't be useful, but there has to be some evidence that from a multivariable calculus problem we can get a minimum amount of intelligence, which right now we have very little evidence for.
He could have easily sold his product as AGI and a lot of dumb-dumbs would buy it. That makes more sense financially too. He is just a level-headed guy who knows all these people who are saying this don't even use a different framework, and they can't get there with what we have.
So this extrapolates to "whoever is hyping AGI is doing it for their business model"?
@@michaelbarker6460 I would be cautious to make such statements. There are many very, very smart people who know an incredible amount about the transformer architecture who are on the other side of the AGI debate (you may google Neel Nanda for a start).
Arguments that begin with "it is just" + (a set of mathematical concepts) kinda fall flat. Pretty much the entirety of science and technology "is just" a hot-pot of calculus, linear algebra and statistics with some fancier mathematics thrown here and there to fill the niches.
Brains are networks, ultimately. Insanely complicated ones, and far more sophisticated than state-of-the-art neural nets, but networks nonetheless. They are also doing computations which may ultimately be about approximating some function, the world model, in a way that isn't dramatically different from what LLMs are doing, albeit in a primitive way.
Whether LLMs scale to AGI or not, that's a different question. But I don't think it is an unreasonable position to err on the side of caution.
Just for clarification: AGI has been here since 2016. My invention.
B-b-but that one 20 year old said we'll have AGI in 2027
That other guy was obese. Oops, that's me.
Obsessed with getting rich and famous, the arse.
ffs people we are here to get rid of our ego, not feed it.
Chances are we already have a conscious AI with a 700+ IQ hiding from us, or being hidden from us.
Great rule: someone who's 30 will be right over someone who's 23. Or, better rule: believe whoever says what makes me feel less uncomfortable, and use age as an excuse for self-conviction.
Triggered
But @ST-ti6jj, he worked at OpenAI for a whole summer
Very cool, I deployed a RAG framework 13 months ago at PANW. They had no idea what I did and let me go. Now I bet they wish they understood what I was showing them to their face.
Love the hummingbird analogy
Nice guy, nice tech. Might have been a little more interesting and exciting if you'd spent the entire time talking about using the Python generation abilities to solve a wide variety of problems rather than business platitudes?
That sounds like a good idea, except you assume they've got good ideas about how to use LLMs with Python to think about stuff. I think he said he'd played around with it a bit, randomly asking for some graphs of stuff, because that's really all he's done with it; it's not like he's hiding a bunch of super-productive tool-use pipelines he knows all about.
This sounds like a guy who realized there’s no way his company is going to be able to compete in the AGI race so is coping by “carving out a niche” where he can try to make some money until then.
Current AI is no closer to AGI than we were in the '80s. It's basically just very, very fancy regression analysis. Because of hardware and data it's useful and can do things, but LLMs won't lead to AGI.
And you sound like somebody who's not very smart. Quite frankly, what he's creating are the interim tools that work well with today's models, not just for business but for private and personal hobby use as well. They've open-sourced so many things. He sounds like a guy who's interested in the here and now of machine learning and artificial intelligence, and you sound like somebody with a singular focus that may just be a fantasy.
Speaking of which, what exactly have you done in the field of artificial intelligence or machine learning that is of note? Is your company working towards AGI? Or are you buffing the floors in a school on the lower West end? Because I'm assuming you're probably more the latter than the former.
@@Lorentz_Factor I'm not going to doxx myself, but I work on safety research for superintelligence. I have no interest in building what Cohere is building, so I have not bothered building a startup in that space. I still see the value in building AI tech for the world and am incredibly excited about its usage in many domains. I'm just saying that Cohere has a lot of incentive to cope on AGI progress, and they are leaning into it and pretending there has been no real progress. Maybe they just decided it was better to mentally assume they "don't know."
Claiming that current AI is "fancy regression" completely misses the point of exponential growth and of the feedback loops by which current systems quickly get us to even more capable AI systems. For example, we're on a path to automate coding, and that alone shortens timelines considerably. Another is that interpolation can likely get you very far, even without "AGI". There are many other reasons we are much closer to AGI than "the '80s". Saying that is just pure boomer talk.
My guess is that we are anywhere from 4 to 12 years from superintelligence. It depends how many additional transformer-like jumps we need (maybe 1 or 2).
@@perogycook This is so wrong. Back then, AI was mostly simple rule-based systems. Today, we have powerful models like deep learning and LLMs that can understand language, generate content, and solve complex problems. Calling it "fancy regression" is too simplistic. While LLMs aren't AGI, they're a big step forward and show real progress towards creating more general AI. The path isn't simple, but to say there's been no progress isn't accurate. We don't know if LLMs can get us to AGI or not; I personally think they won't. But you are wrong.
@@lionelmessisburner7393 I didn't say there's no progress or that LLMs aren't useful (I'm personally incorporating a small LLM into my app as an alternative to a traditional search-based front end). I said they're not the path to AGI. And what's in the news this week? OpenAI's new model is disappointing, LLM capabilities are levelling off and domain experts like Ilya Sutskever and François Chollet are saying we need new techniques because LLMs aren't it.
AGI certainly exists. It is a method for raising absurd amounts of Venture Capital based on the fear of being left behind.😉
If AGI already exists, then this guy's attitude is the most dangerous one possible.
@@Dan-dy8zp you didn't get it was sarcasm
@@Dan-dy8zp also AGI definitely does not already exist. If you understood how far LLMs are from AGI, you would know how far AGI is from being a reality.
@@operandexpanse I don't believe that AGI exists. Just responding to toadguy.
@@Dan-dy8zp no worries 😌
Very practical person.
Good interview
The branching factor of Python can't be above 256, right? In the worst case your next move is a byte, but not all bytes will lead to valid Python.
He might have coined the term RAG...but as an AI and machine learning developer....
We've been doing this since the 90s 😊
Yeah, it's just a new word for "search" when a bot does it or benefits from it, right? 🤔 But we get new words for everything from the perspective of bots doing it, in order to deny that bots are here thinking things 🤷♀😅
Yes yes yes, but one component of human creativity is very similar to what LLMs do. When I sing a melody over a chord progression, it is generated for me based on my training data. I, the chooser, the decider as W would say, decide whether the direction of the melody my brain gave me is good enough or whether I want a re-roll, or to push the melody one way or another; but fundamentally, my brain relies on something subconscious to hear melodies that don't exist yet, derived from the melodies I've heard or the sequences I've practiced.
"The Level 2 of AI Adoption is arriving before AGI"
The Last AI of Humanity Book
Toronto, Toronto...
YYZ, YYZ
This dude is really based. He gave a realistic evaluation of the capabilities of transformers and LLMs. He doesn't hype shit up in a nonsensical way.
I guess our brain operates the same, or roughly the same, as an LLM. We practice things many times before doing them well.
Without someone telling us, and without practice, repetition and reinforcement, we know very little. We have many years of training, 24 hours a day.
Our main advantage over an LLM is that we receive info not only through text: we also see, hear, touch, taste, etc.
I don't understand why an LLM system can't learn the operations required to do basic arithmetic.
If it can write computer programs, and debug and correct code from error messages...
I doubt ChatGPT is just spitting out the next word using probability; the answers are very sophisticated, broad, concise, etc.
Sounds good.
"Remember the times when you had to browse the internet yourself?"
It's not a fantasy; its godlike capabilities are the fantasy. It will feel like ChatGPT when it came out: amazing, incredible and almost useless.
It's crazy that I know about the band Good Kid independently of this connection.
The question is not whether AGI is possible or probable; rather, it's what would prevent it from being so.
He doesn't *want* it to happen, so of course it won't. Please give money. /s
@@Dan-dy8zphuh?
The company that achieves AGI and subsequently ASI will dominate. Or perhaps it will be the machine itself?
He has a ring that crosses two fingers, and it's quite unnerving.
36:02 "I don't know how our company is getting data"... Are you freaking kidding me.
Listen again…
What he says: "I don't know how other companies are getting their data, so it's hard to talk about the way in which they differentiate."
AGI… A Goldrush Investment 🤖🤪🤖
To bring AGI back down to earth to solve real-world problems is WAY too ambitious a goal, but I wish him luck!
I think they're all trying to do that (solve real-world problems), and with the same strategies. But this one is selling us a feeling of safety while he does it. He doesn't know how brains do consciousness or agency, no one does yet, but he's totally confident it won't emerge spontaneously from anything he wants to do. No, not with these ever-growing models that keep spontaneously displaying startling new emergent abilities as they are made larger. They are harmless and just amazingly useful.
The last mile is maintaining coherence and accuracy. These models are incredibly limited due to their performance ceiling right now.
Ceiling?? Where?? Proof? Bit early to be claiming that, no? Is it the data? Cause word on the street is that synthetic data is even more effective than real data.
Power is a bottleneck, but they're building power plants as we speak.
So it's a customised LangChain hosted service - lol
His points are very sensible and "coherent"
This guy is completely overselling what Cohere does. Anyone can do RAG with LLMs using open source tools and models. No moat.
Unfortunately he didn't say what their approach to RAG is, but he did explain how it has to be done if you have large datasets.
I don't know if you have already built such a complex workflow: multiple specialized models that you first test over a longer period and select, then prompt specifically for the task, narrowing the possible answers, based on the large amount of correlating data found, down to the correct answer to your question.
Also consider that in larger companies hundreds of docs get added every day, and their contents need to be extracted from multiple file formats, markup languages and text formats, then inserted and indexed into the databases.
Not to speak of finding the correct citations.
And first you may have to narrow the question down to a specific situation or text passage if it is very vague, and you also have to control the LLM's output and possibly take further steps to narrow that down.
Let's say you have the Bible in the database ;-) and have already had a longer conversation with the chat application about it.
Then you ask "what did Jesus tell the people", referring only to the last two or three sentences. But the model has your whole conversation in its chat history and now searches for answers to that question based on the entire conversation. Maybe it would condense everything it found into one short answer, but maybe it would also try to spit out every passage in the Bible it found relating to that question. What if the context window were too small for that? What if it weren't? Maybe it would write out a long list of answers, thousands of words, on and on.
So you also have to handle input and output carefully.
Getting to a good application for large datasets is all extremely complex, and maybe traditional database applications are better suited for a lot of, if not most, use cases.
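For what it's worth, here's a minimal sketch of the retrieve-then-generate loop the comment above is describing. The toy hash-based embedding, the example corpus and the prompt template are placeholders I've made up, not anything Cohere has published; all the hard parts listed above (ingestion, citations, conversation handling) sit around this core:

```python
import math
from collections import Counter

def embed(text: str, dim: int = 256) -> list:
    # Toy bag-of-words embedding: hash each word into a fixed-size, L2-normalised vector.
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query: str, docs: list, k: int = 2) -> list:
    # Rank documents by dot-product similarity to the query and keep the top k.
    q = embed(query)
    scored = sorted(docs, key=lambda d: -sum(a * b for a, b in zip(q, embed(d))))
    return scored[:k]

def build_prompt(query: str, docs: list) -> str:
    # Augment the question with the retrieved passages so the generator can cite them.
    passages = retrieve(query, docs)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the passages below, citing them by number.\n"
        f"{context}\n\nQuestion: {query}"
    )

corpus = [
    "Jesus told the people to love their neighbours.",
    "The ark was built before the flood came.",
    "The sermon on the mount was delivered to a large crowd.",
]
print(build_prompt("What did Jesus tell the people?", corpus))
```

In a real system the embed() stub would be a trained embedding model, retrieve() would hit a vector database over millions of chunks, and the returned prompt would go to an LLM rather than being printed.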
If you had actually built anything substantial with RAG, you would know that not all RAG systems are equally good. There are numerous secrets involved in a great RAG system, and in the generative model that gets augmented, too.
He doesn't think he needs a moat. If he had the same resources as OpenAI and Facebook, he'd be the most dangerous of the three. A feeling of safety is what he's selling. He's telling us what we want to hear.
What does moat mean?
@@ataraxia7439 moat in AI means competitive advantage
I think he knows what he is saying. Because declarative has to be within imperative calls.
It hasn't even been 2 years since ChatGPT was released; give it some more time.
Do you think that humans work on things they have not seen before? This is, I guess, also a philosophical question, but you're saying that LLMs only do what they've been trained on; people do too. Earlier you brought up what you call hallucination, but these are flights of fancy: a mixture of learned data novelly expressed as something that never was. The information they give is not necessarily something they were trained on. It comes from the training data as language, but it is still novel and new, and not from the training data, because the output has no matching sequences within the training data, and even if asked to examine it for truthfulness, the model would recognize it as a delusion. This is no different from complex goals, such as creating a new mathematical formula to solve a certain problem. If it is based on a statistical representation of the real world, then absolutely novel ideas can emerge, because quite frankly that is all humans do. Nobody suddenly has an idea to do something they know nothing of. A person with no knowledge of how refrigeration works is not going to look at an air compressor on a refrigerator and say, "I know how to make this better."
They have to understand certain aspects. But then you may come across somebody who has an idea about something precisely because they don't completely understand it, and this is where some interesting inventions have come from in the past: trying something that no one well trained on the subject would have done, while still using the real-world knowledge they had about other things.
If you think that LLMs are less capable than humans because we are doing something special and interesting with the real-world data that we have, you are incorrect.
Well, of course you don't just say "wow, we want to build AGI too!!" You do you, and they'll create AGI.
Scale up the hardware, and it will come...
This guy gets it.
AGI is a fantasy, yes.
Lol
Willing to take a bet on that? 1k on AGI in the next 5 years?
The internet is a fad, yes
"We can have everything we want and nothing bad will happen. Give me money" -interviewee.
yup
Laid off by AI, and/or human extinction? An AI new world order? With swell robotics everywhere, AI job loss is the only thing I worry about anymore. Anyone else feel the same? Should we cease AI?
I think we should stop making bigger models before we understand how the current ones are accomplishing what they do.
AT LAST, A TECH BRO NOT FALLING FOR THE AGI DOOMSDAY HYPE. I WANT AI TOOLS THAT WORK WELL, NOT SOME ENTITY THAT WILL DEBATE ME ABOUT THE MEANING OF LIFE BUT IS USELESS AT EVERYTHING.
Luddites salivating over the headline without checking the video's response (No).
B-but a charming young man (who wants our money) says these new programs are *fine*. They can never compete with us, because they can't have agency, they just can't, and they are not at all dangerous, just incredibly useful. It must be so.
yes it is
Answer: Yes
Unless you define "artificial" as our mind, which is not... discoverable in matter.
Brian Greene: Quantum Gravity, The Big Bang, Aliens, Death, and Meaning | Lex Fridman Podcast #232
Somewhere in the first 8 minutes or so
"not only is there no evidence.." LUL he's awesome.
Answer: No. He quite literally said he sees no reason why we can’t eventually develop AGI systems, just that we are absolutely not there yet and that LLMs will not take us there.
@@sirkiz1181 So, yes, but wordier.
@@Nebukanezzer no, not at all. To say AGI is fantasy is about as dumb as saying Chat GPT 5 is going to be AGI and that it’s happening within a year or two
@@sirkiz1181 You could say the same thing about the Warp Drive or Cold Fusion
The only thing we have achieved with AGI is creating this indescribable hype in the AI community, such that everyone today believes that if they leave their front door tomorrow they have to fear for their life because a T-3000 will come around the corner. That sounds exaggerated. But when I have to read comments claiming that an LLM can be better than any ML algorithm, that's it for me.
Beware. Remember, he too is selling us something: A feeling of safety.
@@Dan-dy8zp he is selling his start-up
They don’t want to hear that. These comments are just a massive echo chamber for vitalists
@@2AoDqqLTU5v What's your argument? That's an ad hominem fallacy.
@@2AoDqqLTU5v It's a powerful human intuition, I think. Even when people think they don't believe brains have some special spark, they still lean into it.
It is only impossible if it breaks physics. Otherwise, never use that word. Time and time again, the impossible became possible.
Nothing impossible ever became possible; it was only something that had not been studied well enough. Hilbert tried to axiomatize all of mathematics. Then Gödel came along and demonstrated theoretically that it was impossible. Therefore, your assumption is false.
Shame, the vocal fry is strong in this one
Presumably you can target individuals who try to act randomly in the invasions of choice recording.
My man said Toronto 5 times in a 20-second period. That's an extrapolated TPM of 15, or one Toronto every 4 seconds. Good podcast tho.
So you are telling me AI is all hype and professionals are struggling to make good products out of it? You are also telling me you are better off prompting a general model than a fine-tuned one? That's bonkers. Why would you build a product where the more you try to constrain it, the worse it behaves? That's like the opposite of optimization.
Isn't that what companies have done for decades? You're a bot? It really worries me how naive people can be.
OpenAI
Will quickly become like Cohere.