Has Generative AI Already Peaked? - Computerphile

  • Published May 8, 2024
  • Bug Byte puzzle here - bit.ly/4bnlcb9 - and apply to Jane Street programs here - bit.ly/3JdtFBZ (episode sponsor). More info in full description below ↓↓↓
    A new paper suggests diminishing returns from larger and larger generative AI models. Dr Mike Pound discusses.
    The Paper (No "Zero-Shot" Without Exponential Data): arxiv.org/abs/2404.04125
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharanblog.com
    Thank you to Jane Street for their support of this channel. Learn more: www.janestreet.com

Comments • 2.6K

  • @Computerphile
    @Computerphile  11 days ago +68

    Bug Byte puzzle here - bit.ly/4bnlcb9 - and apply to Jane Street programs here - bit.ly/3JdtFBZ (episode sponsor)

    • @worldofgoblins
      @worldofgoblins 10 days ago +17

      Could you explain what “There exists a non-self-intersecting path starting from this node where N is the sum of the weights of the edges on that path” means? Is the end node for the “path” one of the purple nodes?

    • @MacGuffin1
      @MacGuffin1 10 days ago +4

      Humans do fine with less data; volume of data is clearly not the issue. 'Has generative AI already peaked?': Not even close...

    • @mr_easy
      @mr_easy 10 days ago +2

      @@worldofgoblins Yeah, same doubt

    • @squirlmy
      @squirlmy 10 days ago +5

      But things like LLMs don't work at all like human intelligence. That's like saying "people make ethical decisions all the time, so my pocket calculator should have no problem with issues of morality."

    • @paulmichaelfreedman8334
      @paulmichaelfreedman8334 10 days ago +1

      @@MacGuffin1 There's still a way to go, but I do think there is an asymptote - though not before this type of AI has become as useful and intelligent as the droids in Star Wars, and not much beyond that. It remains an input-output system, defeating any chance of ever evolving human-like emotions, for example. For human-like AGI, I think Ben Goertzel is much in line with that, and he says it is still quite some time away, as such an AI is radically different from transformer and generative AIs.

  • @miroslavhoudek7085
    @miroslavhoudek7085 9 days ago +2487

    As a sort of large trained model myself, running on an efficient biological computer, I can attest to the fact that I've been very expensive over the decades and I certainly plateaued quite some time ago. That is all.

    • @ExecutionSommaire
      @ExecutionSommaire 8 days ago +44

      Haha, I totally relate

    • @avi7278
      @avi7278 7 days ago +22

      fr fr

    • @SteinGauslaaStrindhaug
      @SteinGauslaaStrindhaug 6 days ago +63

      @@avi7278 Apparently youtube thinks "fr fr" isn't English so it offered to translate it… it translates to "fr fr" apparently 🤣

    • @AnimeUniverseDE
      @AnimeUniverseDE 6 days ago +14

      I get that you were just making a joke but the current version of AI could not be further from humans.

    • @salasart
      @salasart 6 days ago

      XD This was hilarious!

  • @tommihommi1
    @tommihommi1 10 days ago +4211

    generative AI has destroyed internet search results forever

    • @vincei4252
      @vincei4252 10 days ago +666

      Nah, Google did that because of their greed.

    • @priapulida
      @priapulida 10 days ago +9

      @@vincei4252 .. and because they are woke
      (edit: I thought this is obvious, but apparently not, use the simple prompt "Google and DEI" to get a summary)

    • @Alice_Fumo
      @Alice_Fumo 10 days ago +128

      Come on, any 100IQ+ human with half an hour of time could figure out how Google or whoever could largely fix those issues if they really wanted to.
      Also, the grapevine says that OpenAI search gets announced next Monday, so maybe there'll be some competition finally. Take buckets of salt with this though; I don't know where I heard it, but I'm quite sure it wasn't a trustworthy source.

    • @tommihommi1
      @tommihommi1 10 days ago +734

      @@yensteel the point is that trash generated by AI has flooded the results

    • @no_mnom
      @no_mnom 10 days ago +151

      Add before:2023 to the search

  • @marcusmoonstein242
    @marcusmoonstein242 9 days ago +2065

    You've just described the problem being experienced by Tesla with their self-driving software. They call it "tail-end events", which are very uncommon but critical driving events that are under-represented in their training data because they're so rare.
    Tesla has millions of hours of driving data from their cars, so the software is better than humans in situations that are well-represented in the data such as normal freeway driving. But because the software doesn't actually understand what it's doing, any event that is very uncommon (such as an overturned truck blocking a lane) can lead to the software catastrophically misreading the situation and killing people.

    • @bened22
      @bened22 8 days ago +159

      "Better than humans" (X)

    • @andrej2375
      @andrej2375 8 days ago +41

      It's better than humans AND we'll keep working on it

    • @James2210
      @James2210 8 days ago +206

      @@bened22 It's like you commented without actually reading what it says

    • @bened22
      @bened22 8 days ago +105

      @@James2210 I read it but I don't even believe the softened claim.

    • @ids1024
      @ids1024 8 days ago +67

      In theory, an advantage of self-driving cars *could* be that the software has "experience" with many of these uncommon situations that few human drivers would, which could save lives when a quick reaction is needed, or the best response is something most people wouldn't think to do. But that may be decades away still.

  • @Rolox01
    @Rolox01 8 days ago +1220

    So refreshing to hear grounded academics talk about these sorts of things and take a realistic look at what's happening. Feels like everyone else just wants to say anything about generative AI.

    • @Dino1845
      @Dino1845 7 days ago +68

      It's only now that we're past the initial hype that I feel we're learning of the monstrous cost & technical limitations of this technology.

    • @harrylane4
      @harrylane4 7 days ago +91

      @@Dino1845 I mean… people have been talking about that since the start, you just weren't listening

    • @the2theonly672
      @the2theonly672 7 days ago +37

      @@harrylane4 Exactly, you can't clickbait that like you can with "this crazy new AI will take your job"

    • @snickerdoooodle
      @snickerdoooodle 6 days ago +14

      People can still talk about the ethics and ramifications that AI has on the human element without your permission, just saying.

    • @the_mad_fool
      @the_mad_fool 6 days ago

      It's because all the crypto grifters jumped onto the AI bandwagon, so there's just a ton of scammers and liars flooding the air with their phony claims. Reminds me of back when stem cell research was the next big medical thing, and suddenly people were coming out with "stem cell anti-aging cream" made from "bovine stem cells."

  • @leckst3r
    @leckst3r 8 days ago +870

    10:37 "starts to hallucinate"
    I recently heard it expressed that AI doesn't "sometimes" hallucinate. AI is always hallucinating and most of the time its hallucination matches reality/expectation.

    • @sebastiang7394
      @sebastiang7394 8 days ago +240

      Yeah, but the same could be said of humans to some extent. We always have our own model of the world that is flawed and doesn't match reality perfectly.

    • @fyang1429
      @fyang1429 8 days ago +15

      AI is just a very, very clever Hans, so that does make sense

    • @drew1564
      @drew1564 8 days ago +65

      If a hallucination matches reality, it's not a hallucination.

    • @fartface8918
      @fartface8918 7 days ago

      @@drew1564 Untrue

    • @Brandon82967
      @Brandon82967 7 days ago +8

      Not true. Someone told GPT-3, which is way worse than GPT-4, that it could call the user out if they asked a nonsense question. It was able to answer sensible questions like "who is the 40th president of the US" and called the user out when they asked nonsense like "when is the spatula frighteningly spinning"

  • @michaelujkim
    @michaelujkim 10 days ago +2032

    even if you took the whole internet as a dataset, the real world is orders of magnitude more complicated.

    • @B.D.E.
      @B.D.E. 10 days ago +247

      A simple but very important point that's easy to forget with all the optimistic ambitions for AI.

    • @mharrisona
      @mharrisona 10 days ago +10

      I appreciate your comment sir

    • @dahahaka
      @dahahaka 10 days ago +89

      Which is part of why people are working on merging robotics and ML.
      However, nobody is trying to let these things train on the real world; quite the opposite. It turns out that training on the "whole internet" is vastly more efficient, and transfers zero-shot into the real world without any problems.
      It's funny how you assume humans have perfect models of the world... You just need very, very rough approximations.

    • @Cryptic0013
      @Cryptic0013 10 days ago +77

      Yup. Look at the behaviors and beliefs of human beings who are, as they say, chronically online. Getting 100% of your information from the internet's not a great anchor to reality

    • @freezerain
      @freezerain 10 days ago +34

      But an AI doesn't need to be perfect, just better than an average human. If the dataset contains every image recording, all books, movies, music, news, TikToks and comments, that could be enough to be better than a human at some tasks.

  • @ekki1993
    @ekki1993 10 days ago +481

    As a bioinformatician, I will always assume that the exponential growth will plateau sooner rather than later. Sure, new architectures may cause some expected exponential growth for a while, but they will find their ceiling quite fast.

    • @visceralcinema
      @visceralcinema 9 days ago +36

      Thus, the exponential curve becomes logarithmic. 🤭🤓

    • @arthurdefreitaseprecht2648
      @arthurdefreitaseprecht2648 9 days ago +36

      @@visceralcinema the famous logistic curve 😊

    • @typeins5139
      @typeins5139 9 days ago

      Imagine: BlackRock created Aladdin, the supercomputer, back in the day, in the desert, protected by the American military in full force. And now we are talking about small companies (compared to that giant monster).

    • @ekki1993
      @ekki1993 9 days ago +16

      @@typeins5139 Were you responding to a different comment there?

    • @jan7356
      @jan7356 8 days ago +4

      The curve he drew and explained was actually logarithmic in the amount of data (sublinear).
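
The three curve shapes this thread is juggling are easy to put side by side; a minimal numpy sketch with made-up values (illustrative only, not the paper's data):

```python
import numpy as np

n = np.logspace(0, 6, 7)  # dataset size: 1 ... 1,000,000 examples

exponential = 2.0 ** np.log10(n)                  # optimist: keeps accelerating
logarithmic = np.log10(n + 1)                     # pessimist: diminishing returns
logistic = 1 / (1 + np.exp(-(np.log10(n) - 3)))   # S-curve: growth, then a ceiling

for row in zip(n, exponential, logarithmic, logistic):
    print("n=%9d  exp=%6.2f  log=%5.2f  logistic=%4.2f" % row)
```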

  • @RobShocks
    @RobShocks 10 days ago +435

    Your ability to articulate complex topics so simply, ad lib, with very few cuts and edits, is amazing. What a skill.

    • @philippmillanjochum1839
      @philippmillanjochum1839 7 days ago

      Facts

    • @Pabliski577
      @Pabliski577 7 days ago +4

      Yeah it's almost like he talks to real people

    • @bikerchrisukk
      @bikerchrisukk 7 days ago

      Yeah, really understandable to the layman 👍

    • @docdelete
      @docdelete 6 days ago +3

      It's called teaching, and it's really not that rare a skill outside of the internet bubble.

    • @RobShocks
      @RobShocks 5 days ago +1

      @@docdelete Teaching you say?! What a cool concept. Thanks for sharing.

  • @peterisawesomeplease
    @peterisawesomeplease 9 days ago +162

    I think a key issue is we are actually running out of high quality data. LLMs are already ingesting basically all high quality public data. They used to get big performance jumps by just using more data. But that isn't really an option anymore. They need to do better with existing data.

    • @jlp6864
      @jlp6864 7 days ago +63

      They're also now "learning" from AI-generated content, which is making them worse

    • @Alan.livingston
      @Alan.livingston 7 days ago +15

      I worked on a system a while back that used parametric design to convert existing architectural plans and extrapolate them out into 3D models needed to feed into steel-frame rolling machines. The hardest part of what we did was accommodating the absolute garbage architects would put in the plans. Turns out when a source isn't created with being machine-readable in mind, it's often difficult to do anything about it.

    • @peterisawesomeplease
      @peterisawesomeplease 7 days ago +39

      @@Alan.livingston Yeah, the problem of garbage, or at least badly structured, data is really clear in LLMs. Probably the most obvious example is that they never say "I don't know", because no one on the internet says "I don't know". People either respond or they say nothing. So the LLMs don't have any idea of uncertainty. Another related issue that comes up constantly is that LLMs will give the popular answer to a popular question when you actually asked a question with a slight twist on the popular one. For example, ask an LLM "what is an invasive species originating on an island and invading a mainland". They all answer the much more popular reverse question. It's a data problem: the LLMs can't escape the overwhelmingly larger amount of data on the reverse question, because all they see is the text.

    • @quickdudley
      @quickdudley 7 days ago +1

      @@jlp6864 There are machine learning algorithms that aren't really affected by that kind of thing, but adapting them to text and image generation would be pretty tricky to do right.

    • @picketf
      @picketf 6 days ago +4

      @@peterisawesomeplease I asked ChatGPT 4 to calculate the distance to gamma-ray burst 080916C, one of the most violent time-distortion events ever on record, using a trifecta of: 1. the cosmological formula for luminosity distance in a flat universe; 2. redshift, specifically to compensate for the time dilation effect; 3. emission energy calculations. It responded three times: for the first two answers, right after filling the page with formulas, it concluded they were incorrect and restarted the answering process, presenting new numbers. I'd say it is a rather peculiar case, but for sure those two wrong answers, and the fact that it became aware of their fallacy AFTER formulating everything twice, are a testament to its indecision 😅

  • @djdedan
    @djdedan 8 days ago +324

    I'm not a child development specialist, so take this with a grain of salt, but what's interesting is that you can show a child one image of a cat (it doesn't even have to be realistic) and they'll be able to identify most cats from then on. They may mistake a dog for a cat and have to be corrected, but from then on they will be able to discern the two with pretty high accuracy. No billions of images needed.

    • @bened22
      @bened22 8 days ago +103

      Yes. Computer AI has nothing to do with human intelligence.

    • @joaopedrosousa5636
      @joaopedrosousa5636 8 days ago +127

      That brain was in fact trained with a vast amount of data. That baby brain was created from DNA that guided the development of those nervous structures while it was in the mother's womb. The ancestors of that baby, going millions of years into the past, interacted with the world through visual means, and the genes of those more successful at reproducing encoded those useful visual perception and classification abilities.

    • @dtracers
      @dtracers 8 days ago +55

      What you are missing is the 3-4+ years of "training" on different cats and dogs the human has had up to that point, plus millions of years of evolution that produced the best sub-networks for learning.
      That second piece is hard because it's like running Google's AutoML over every possible dataset and every possible network architecture for an LLM.

    • @ParisCarper
      @ParisCarper 8 days ago +40

      Like someone else mentioned, evolution has selected for brains that can learn and adapt new concepts quickly. Especially very tangible concepts like different types of animals and how to easily recognize them.
      For the AI, you have to start from scratch. Not only do you have to teach it cats, but you have to teach it how to understand concepts in general

    • @mrosskne
      @mrosskne 8 days ago +11

      @@ParisCarper what does it mean to understand a concept?

  • @Reydriel
    @Reydriel 10 days ago +826

    5:35 That was clean af lol

    • @squirlmy
      @squirlmy 10 days ago +67

      I've never seen a better right angle drawn by hand!

    • @NoNameAtAll2
      @NoNameAtAll2 10 days ago +24

      @@squirlmy tbf, it's on squared paper

    • @dl0.0lb
      @dl0.0lb 10 days ago +37

      @@NoNameAtAll2 I certainly couldn't do that that quickly and smoothly even if it were gridded!

    • @drlordbasil
      @drlordbasil 10 days ago +31

      he's an AI

    • @BillAnt
      @BillAnt 10 days ago +4

      AI is more like crypto, with diminishing returns. There will be incremental improvements, but each less and less significant relative to the original starting point.

  • @allgoodhandlesweretaken
    @allgoodhandlesweretaken 7 days ago +63

    Also, a lot of people seem to think that OpenAI came up with some novel way of engineering this stuff, when in reality most of the progress we have seen is just the result of more compute and an increase in parameter count and dataset size. It seems unlikely there will be an exponential curve when scaling these up is so hard and expensive.

    • @tux_the_astronaut
      @tux_the_astronaut 7 days ago +11

      Yeah, also not many people mention the energy demands of AI. With all the water and electricity needed, we can only go so far.

    • @picketf
      @picketf 6 days ago +2

      Two things speak against your theory. Why would Microsoft invest in OpenAI? Why are the datasets, algorithms and methods closed source if it's only a scalability issue?

    • @prabhdeepsingh8726
      @prabhdeepsingh8726 5 days ago +10

      @@picketf Microsoft did not hand over money to OpenAI. The 10 billion dollar investment was the cost of Microsoft's cloud infrastructure, over which OpenAI trained their models. So basically Microsoft will let OpenAI use their hardware for free until their consumption cost reaches 10 billion dollars or so. It's more like lending than investment.

    • @headlights-go-up
      @headlights-go-up 5 days ago +15

      @@picketf Microsoft has invested a metric ton of cash into hundreds of projects that don't pan out. Just because they invest in something doesn't mean that something is a guaranteed success.

    • @user-co8cs9vc4q
      @user-co8cs9vc4q 4 days ago

      @@picketf Business. If your model scores 83.5% on benchmarks and your opponent's scores 83.6%, you will lose users and thus money.

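For reference, the shape usually cited behind the "just more compute, parameters and data" claim at the top of this thread is the Chinchilla-style scaling law of Hoffmann et al. (2022); the exponents below are the approximate published fits:

```latex
% Approximate Chinchilla scaling law: loss as a function of
% parameter count N and training tokens D.
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
\qquad \alpha \approx 0.34, \quad \beta \approx 0.28
% As N and D grow, both correction terms shrink toward zero and the loss
% approaches the irreducible term E: each order of magnitude buys less.
```
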
  • @supersnail5000
    @supersnail5000 10 days ago +699

    I'm surprised 'degeneracy' wasn't also mentioned in this - basically, as more AI-generated content leaks into the dataset, further training could actually lead to worse results. There are ways of encoding the output to evidence that the data was generated, but that likely won't hold up if edits were made to the data before it entered the training corpus.

    • @Raccoon5
      @Raccoon5 10 days ago +53

      AI-generated data is frequently used for training AI, and it has pretty decent results.
      I doubt what you say is true, but having the real world in the dataset is always important.
      That's not really a problem, though, since we are taking more and more videos and photos of the real world.

    • @TheManinBlack9054
      @TheManinBlack9054 10 days ago +62

      I do not think that the so called "model collapse" presents an actual danger to AI advancement as was shown by the Phi models. The models can be trained on synthetic data and perform well.

    • @existenceisillusion6528
      @existenceisillusion6528 10 days ago +130

      @@TheManinBlack9054 That synthetic data was carefully curated. The problem still exists if someone isn't careful enough.

    • @monad_tcp
      @monad_tcp 10 days ago +33

      @@Raccoon5 No, those are called adversarial models; they don't work that well.

    • @pvanukoff
      @pvanukoff 10 days ago +15

      Think about pre-AI, when we just had humans learning. What "dataset" did they learn on? They learned on things created by humans before them. If humans can learn from humans, and generate new, interesting, innovative results, I don't see why AI can't do the same by learning from and building on data generated by other/previous AI.
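
A toy illustration of the degeneracy worry that starts this thread, as a minimal numpy sketch (a fitted Gaussian stands in for the generative model; see arXiv:2305.17493 for the real analysis):

```python
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: a small corpus of real data from the true distribution N(0, 1).
data = rng.normal(0.0, 1.0, size=100)

for gen in range(1, 51):
    mu, sigma = data.mean(), data.std()     # "train" the model on the corpus
    data = rng.normal(mu, sigma, size=100)  # next corpus = model output only
    if gen % 10 == 0:
        print(f"gen {gen:2d}: mu={mu:+.3f}  sigma={sigma:.3f}")

# Each refit inherits the previous generation's sampling error, so the
# parameters drift; nothing ever pulls them back toward the real distribution.
```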

  • @tommydowning3481
    @tommydowning3481 10 days ago +303

    I love this content where we get to delve into white papers with the convenience of a YouTube video, not to mention the genuine enthusiasm Mike always brings to the table.
    Great stuff, thanks!

    • @xX_dash_Xx
      @xX_dash_Xx 10 days ago +11

      Same here, +1 for the paper review. And I appreciate the pessimism -- a nice change of pace from the autofellatio that Two Minute Papers does.

    • @skoomaenjoyer9582
      @skoomaenjoyer9582 6 days ago

      @@xX_dash_Xx I've had my fill of the generative AI hype too… "No, I'm not worried that my quality, new work will be replaced by a machine that reproduces heavily-represented, average work."

    • @blucat4
      @blucat4 4 days ago +1

      Agreed, I love Mike's videos.

    • @brianbagnall3029
      @brianbagnall3029 2 days ago

      @@blucat4 The videos are all right, but you can tell he really wants to pick his nose.

    • @lobstrosity7163
      @lobstrosity7163 19 hours ago

      That paper will be displayed in the Museum of Human Civilization. The robotic visitors will shake their headoids at the naïveté of their late creators.

  • @Posiman
    @Posiman 6 days ago +15

    This is the computational side of the argument for an AI peak.
    The practical side is that the amount of existing high-quality data in the world is limited. The AI companies are already running out.
    They theorize about using synthetic data, i.e. using model-generated data to train the model. But this leads to model collapse, or "Habsburg AI", where the output quality quickly starts deteriorating.

  • @TheGbelcher
    @TheGbelcher 9 days ago +423

    “If you show it enough cats and dogs eventually the elephant will be implied.”
    Damn, that was a good analogy. I’m going to use that the next time someone says that AI will take over the Earth as soon as it can solve word problems.

    • @tedmoss
      @tedmoss 8 days ago +5

      We haven't even figured out if we are at the limit of intelligence yet.

    • @WofWca
      @WofWca 8 days ago +3

      1:00

    • @Brandon82967
      @Brandon82967 7 days ago +8

      That's not true but if you show it enough Nintendo games and images of supervillains, it can put Joker in Splatoon.

    • @SimonFrack
      @SimonFrack 7 days ago +1

      @@Brandon82967 What about Darth Vader in Zelda, though?

    • @Brandon82967
      @Brandon82967 7 days ago

      @@SimonFrack It's possible

  • @PristinePerceptions
    @PristinePerceptions 10 days ago +211

    The data we have is actually incredibly limited. We mostly use only 2D image data. But in the real world, a cat is an animal. We perceive it in 3D space with all of our senses, observe its behavior over time, compare it all to other animals, and influence its behavior over time. All of that, and more, makes a cat a cat. No AI has that kind of data.

    • @swojnowski8214
      @swojnowski8214 8 days ago +5

      You can't recreate a 3D matrix using a 2D matrix; it is about dimensions. Development is about going from a lower to a higher dimension. You can do it if you are at the higher dimension, but not at the lower, unless you make up some stuff to plug the holes. That's why LLMs hallucinate; that's why we dream...

    • @burnstjamp
      @burnstjamp 8 days ago +36

      However, it's also true that humans can easily and accurately tell animals apart from a young age, even if shown only static images (or even abstract representations) of them. The fact that we have more senses and dimensions with which we can perceive input seems less important than the fact that the human mind simply has far more capacity for pattern recognition.
      I also don't think that introducing more variety in input would solve the issue presented in the video, only delay it. If 2D image data fails to produce exponential or linear improvement in a generalized model over time, I fail to see how 3D data, or sonic data, or temporal data, or combinations thereof would substantially change the reality of the problem.

    • @joelthomastr
      @joelthomastr 7 days ago +3

      Has nobody done the equivalent of plugging a large neural network into a Furby or an Aibo?

    • @celtspeaksgoth7251
      @celtspeaksgoth7251 6 days ago +1

      @@burnstjamp and inborn instinct

    • @picketf
      @picketf 6 days ago +2

      Well, AI is currently being trained on the billions of minutes being uploaded to YouTube. Imagine being able to say, at any point in 2023 or 2024, that you had watched every single cat video ever uploaded.

  • @LesterBrunt
    @LesterBrunt 5 days ago +15

    People just completely underestimate how complex human cognition is.

  • @OneRedKraken
    @OneRedKraken 7 days ago +91

    This is sort of confirming my suspicions of where this is all heading atm.
    When I understood the reason why AIs are confused about human hands and limbs, I understood the biggest flaw with AI: it doesn't 'understand' anything. Which is why, even though it's been fed tons of reference images and photos of humans, it still doesn't understand that the human hand has no more than 5 fingers.
    Why is that? Because its training data has pictures of people holding something with two hands, but where one hand is hidden by the angle/perspective. So the AI only sees one hand, but a bunch of extra fingers. Its conclusion: "a human hand can have up to 10 fingers". That's a really big hurdle to climb over.

    • @picketf
      @picketf 6 days ago +4

      The fingers problem has been fixed in DALL-E 3. Also, you can ask it for 3D models now and it will output a Blender script, which means it's being trained to link concepts to shapes.

    • @nbboxhead3866
      @nbboxhead3866 6 days ago +10

      A lot of the problems caused by limbs and fingers in AI-generated images happen because there isn't any pre-planning, so duplicates of something there's meant to be only one of happen easily. For example, if you have it try to generate an image of a man waving at you, there are several different positions his arms and hands could be in that would make a coherent image, and because the AI doesn't start out thinking "one arm here, fingers positioned like this..." and instead just generates an image based on a function that returns pixel values independently of each other, you get separate parts of the image thinking they're the same arm.
      I guess DALL-E 3 must have some sort of pre-planning that happens, which is why it doesn't fail at limbs and fingers as badly. (I say "as badly" because it still won't be perfect.)

    • @kirishima638
      @kirishima638 6 days ago +3

      Except this has largely been fixed now.

    • @TheMarcQ
      @TheMarcQ 5 days ago +23

      "Fixed" by programmers adding phrases to your prompts making sure to include the appropriate number of fingers. That is a hotfix at best.

    • @picketf
      @picketf 5 days ago +1

      @@TheMarcQ Apparently polydactyly is a real condition that is statistically not that rare. Will Smith slurping noodles was not that long ago, and current AI is really leaps better.

  • @t850
    @t850 10 days ago +134

    ..."pessimistic" (logarithmic) performance is what economists would call the "law of diminishing returns", and is basically how systems behave if you keep increasing one parameter but keep all other parameters constant... :)

    • @fakecubed
      @fakecubed 10 days ago +10

      The thing is, the other parameters aren't constant. I also don't think we're close to maxing out on dataset sizes.

    • @ekki1993
      @ekki1993 9 days ago +49

      And exponential performance is what any scientist outside of economics calls "unsustainable".

    • @t850
      @t850 9 days ago +14

      @@fakecubed ...that may be so, but each parameter contributes to the outcome to some degree, and even those have their limits. In the end it's only a matter of where exactly we are on the logarithmic curve. At first the curve may look as if it will keep rising indefinitely, but in reality it always reaches the "ceiling" before it flattens out.
      It's like driving a car. At first it seems as if it will keep on accelerating forever, but in the end it reaches top speed no matter how hard or how long you floor it, how many gears you have, how much power the engine has, or how low a drag you can reach. If you want to keep on accelerating you need a new paradigm in propulsion (technology)...

    • @t850
      @t850 9 days ago +7

      @@ekki1993 ..."stonks" curve...:P

    • @Baamthe25th
      @Baamthe25th 9 days ago

      @@fakecubed What other parameters can really be improved, to the point of avoiding the diminishing-returns issue with datasets?
      Genuinely curious.
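
The diminishing-returns shape is easy to quantify: fit accuracy against log10(dataset size) and read off the gain per tenfold increase in data. A sketch with made-up numbers, purely for illustration:

```python
import numpy as np

# Hypothetical accuracy at increasing dataset sizes (illustrative only).
n = np.array([1e4, 1e5, 1e6, 1e7, 1e8])
acc = np.array([0.42, 0.55, 0.64, 0.71, 0.76])

a, b = np.polyfit(np.log10(n), acc, deg=1)  # least squares: acc ~ a*log10(n) + b
print(f"gain per 10x more data: {a:.3f}")   # each extra point costs 10x the data
print(f"naive projection at n=1e9: {a * 9 + b:.3f}")
```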

  • @msclrhd
    @msclrhd 10 days ago +270

    I've seen image generation get worse the more qualifiers you add as well. Like asking for "a persian cat with an orange beard" -- it will start including orange-coloured things, the fruit, or colour all the fur orange, give it orange eyes, or make the background orange. I think this is a fundamental limitation of the way transformer models work.
    Specifically, transformers are trying to reduce the entire thing to a single concept/embedding. This works when the concepts align ("Warwick Castle", "Medieval Castle", etc.) but not when the concepts are disjoint (e.g. "Spiderman posing for a photo with Bananarama."). In the disjoint case it will mix up concepts between the two different things as it mushes them into a single embedding.
    A similar thing happens when you try to modify a concept, e.g. "Warwick Castle with red walls". The current models don't understand the concept of walls, etc., or how things like cats or castles are structured, nor can they isolate specific features like the walls or windows. If you ask for things like that (e.g. "Warwick Castle with stained glass windows") it is more likely to show pictures focused on that feature rather than an image that has both.

    • @Pravlord
      @Pravlord 10 days ago +11

      nope

    • @justtiredthings
      @justtiredthings 10 days ago +25

      Image generators are empirically getting better at this sort of coherence and prompt-adherence, so 🤷

    • @inverlock
      @inverlock 10 days ago +25

      Most people just work around this issue by doing multi step generation where they clearly separate concepts between steps and allow the later steps to generate content in the image with the context of the previous step as a base image. This doesn’t actually solve the issue but is a reasonably effective mitigation.

    • @roxymigurdia1
      @roxymigurdia1 10 days ago +7

      That's not how transformers work, and also I don't know where you got the idea that models don't understand the concept of walls

    • @msclrhd
      @msclrhd 10 days ago +43

      @@roxymigurdia1 If you are just talking about walls, transformers can encode that concept as a vector, sure. Similarly they can encode variants of those, given enough data, like "brick wall" or "wall with graffiti".
      My point is that if you ask the model to point out the walls on Warwick Castle it cannot do that, as it does not know how to relate the concept (feature vector) of a wall to the concept (feature vector) of castles. Thus, even if it can encode "red brick walls" and "warwick castle" correctly (which it can), it does not necessarily know how to draw "warwick castle with red brick walls", as it does not know how to align those two concept vectors, nor where the walls on the castle are in order to style them differently.
      That is what I meant.
      I've just tested this on Stable Diffusion 2.1 and it does a lot better job with these combined wall/castle examples than 1.5 does. It still messes up my "Spiderman in a photo with Bananarama" example, though (ignoring the weird faces) :shrug:!
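
For anyone who wants to poke at the "single embedding" point, here is a rough sketch using the openai/clip-vit-base-patch32 text encoder via Hugging Face transformers. It only illustrates how composite prompts sit close to their parts in embedding space; it is not the exact conditioning pipeline Stable Diffusion uses:

```python
# pip install torch transformers
import torch
from transformers import CLIPModel, CLIPTokenizer

name = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(name)
tokenizer = CLIPTokenizer.from_pretrained(name)

prompts = [
    "a persian cat with an orange beard",  # the composite concept
    "a persian cat",                       # its parts, plus a distractor
    "an orange cat",
    "an orange, the fruit",
]
inputs = tokenizer(prompts, padding=True, return_tensors="pt")
with torch.no_grad():
    feats = model.get_text_features(**inputs)
feats = feats / feats.norm(dim=-1, keepdim=True)   # unit-normalise
print((feats @ feats.T).numpy().round(3))          # pairwise cosine similarity
```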

  • @ohalloranpeter
    @ohalloranpeter 9 days ago +26

    I was having exactly this argument during the week. Thanks Computerphile for making the point FAR better than I!

  • @Augustus_Imperator
    @Augustus_Imperator 6 days ago +8

    I'm sorry man, you chose the wrong day to publish this video 😅

  • @marcbecker1431
    @marcbecker1431 6 days ago +16

    This is totally off-topic, but the quality of the X and Y axes of the graph at 5:43 is just stunning.

  • @jorgwei8590
    @jorgwei8590 9 days ago +3

    Really nice episode. For LLMs it's interesting to see that all the frontline models are still basically at GPT-4 level. But then, they all seem to be based on a similar tech stack. Avenues of improvement seem to be: moar and better data, algorithmic and model architecture improvements, better training techniques, better prompting techniques, MOAR compute. It's going to be interesting to see if any of these also work as a limiting factor.

  • @TheNewton
    @TheNewton 10 days ago +25

    4:12 "it didn't work", too right. AFAIK the emergent behaviors of large language models (LLMs, big datasets) as you scale up are plateaus, and have not led to any consistent formula/metrics indicating that emergent behavior can be extrapolated as a trend.
    Meaning we don't know if there are more tiers to the capabilities an LLM could have, even IF you have enough data.
    And there's a paper arguing that such emergence is humans doing the hallucinating, a 'mirage' due to the measurement metrics [1]; also see the emergence part of the Stanford transformer talk [2], [3].
    The other issue in the here and now with scaling up to even bigger data is that most of the data after ~2023 is just gonna be garbage, as every source (the internet, internal emails, ALL content sources) gets polluted by being flooded with AI-generated content, with no true way to filter it out.
    AKA model collapse [4], though I don't know of much published work on the problem of LLMs eating their own and each other's tails; probably more if you view it as an attack vector for LLM security research. Bringing us again and again to realizing that authenticity is only solvable by expensive intervention of human expertise to validate content.
    [1] arXiv:2206.07682 [cs.CL] Emergent Abilities of Large Language Models
    [2] youtube:fKMB5UlVY1E?t=1075 [Stanford Online] Stanford CS25: V4 I Overview of Transformers, Emily Bunnapradist et al.
    [3] arXiv:2304.15004 [cs.AI] Are Emergent Abilities of Large Language Models a Mirage?
    [4] arXiv:2305.17493 [cs.LG] The Curse of Recursion: Training on Generated Data Makes Models Forget

    • @SlyRocko
      @SlyRocko 8 hours ago

      The polluted data could definitely be alleviated if generative AI had functionality similar but opposite to the Nightshade anti-AI tool, where generated AI works could have injections to exclude themselves from learning models.
      However, there are still the other limits to AI that can't be solved without some novel solution that we probably won't find anytime soon.
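
The "mirage" argument in [3] above can be reproduced as a toy: let per-token accuracy improve smoothly with scale, then score with a hard metric that requires a whole answer to be right. Hypothetical numbers, for illustration only:

```python
import numpy as np

scale = np.logspace(0, 8, 9)         # model "scale" in arbitrary units
p_token = 1 - 0.5 * scale ** -0.3    # per-token accuracy: smooth, no jumps
exact_match = p_token ** 20          # all 20 answer tokens must be right

for s, p, em in zip(scale, p_token, exact_match):
    print(f"scale={s:12.0f}  token acc={p:.3f}  exact match={em:.3f}")

# The smooth metric improves gradually; exact match sits near zero and then
# "suddenly" takes off -- apparent emergence created by the metric choice.
```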

  • @kevindonahue2251
    @kevindonahue2251 9 days ago +8

    Yes, the data required to train new models grows exponentially. They've already trained on everything they can get their hands on and are moving towards "synthetic", i.e. AI-generated, data. The age of the Habsburg AI is already here.

  • @astropgn
    @astropgn 10 days ago +11

    Every video I watch with Dr Pound makes me want to do a class with him. I wish his university would record the lectures and make them available on YouTube.

  • @thunken
    @thunken 10 days ago +224

    10:17 the blue marker pen _still_ doesn't have the lid on...

    • @harryf1867
      @harryf1867 10 days ago +10

      I am that Analyst who says this in the meeting room to the person at the whiteboard :)

    • @redmoonspider
      @redmoonspider 10 days ago +2

      Where's the sixth marker? Bonus question: what color is it?

    • @sergey1519
      @sergey1519 10 days ago +2

      @@redmoonspider Black.

    • @sergey1519
      @sergey1519 10 days ago +4

      Last seen in the Stable Diffusion video.

    • @redmoonspider
      @redmoonspider 10 days ago +1

      @@sergey1519 great answer!

  • @EtienneFortin
    @EtienneFortin 9 days ago +25

    It's probably always the case for any problem. At some point brute force plateaus. Reminds me of when the only way to increase processor speed was increasing the clock speed. I remember seeing graphs where performance grew linearly and they were planning a 10 GHz P4. At some point they needed to be a little bit more clever to increase performance.

    • @beeble2003
      @beeble2003 5 days ago +1

      Clock speed is a nice analogy. 👍

  • @KnugLidi
    @KnugLidi 10 days ago +56

    The paper reinforces the idea that bulk data alone is not enough. An agent needs to be introduced into the learning cycle, where the particular algorithm needs to identify what pairs are needed for specific learning tasks. In a nutshell, the machine needs to know how to direct its own learning toward a goal.

    • @davidgoodwin4148
      @davidgoodwin4148 10 days ago +1

      We are doing it a different way, via prompts. Our example is SQL for our system. The LLM knows SQL. It does not know our system, as we never posted its structure publicly (not because it is super secret, it is just internal). We plan to feed it a document describing our system. We then tell it to answer questions based on it (as a hidden prompt). That would work for teaching a model what an elephant is, but I do feel you could provide fewer examples of new items once you have it generally trained.

    • @PazLeBon
      @PazLeBon 10 days ago +1

      if all people were the same that could eventually work... but we ain't; I'm particularly contrary by default :)

    • @Hexanitrobenzene
      @Hexanitrobenzene 10 days ago +1

      AlphaCode 2 has convinced me that LLMs + search will be the next breakthrough. The generate-verify paradigm. At present, it's not clear how to make the "verify" step as general as "generate".

    • @PazLeBon
      @PazLeBon 9 days ago +2

      @@Hexanitrobenzene LLMs kinda are search already

    • @siceastwood2714
      @siceastwood2714 9 days ago

      @KnugLidi Isn't this just the concept of synthetic data? There are actually efforts to create AIs specialized in creating the synthetic data needed for further training. I'm kinda confused why this concept isn't even mentioned here.

  • @doomtho42
    @doomtho42 10 days ago +12

    I can absolutely see there being a plateau regarding AI performance using current data processing techniques, and honestly I don’t think that’s as much of a “cavalier” idea as he suggests - I think the interesting part is how we progress from there and what new techniques and technologies arise from it.

  • @petermoras6893
    @petermoras6893 6 days ago +5

    I think people mysticize machine learning and generative AI far more than they need to.
    At the end of the day, ML is just an arbitrary function. It can be any function, as long as we have the right input and output data.
    The obvious problem is that the possibility space of any problem balloons exponentially with its complexity, so you eventually reach a point where you don't have enough resources to brute-force the solution.
    However, I don't think we've reached the peak of generative AI, as there are other avenues of improvement besides more training data.
    One solution I think we'll see employed more is using more complex algorithms that help bridge the gap between the input and output data.
    For example, we don't train a neural net on pure images. We use a convolutional layer at the start in order to pre-process the image into data that is easier to find correlations with.
    But these layers can be anywhere in the NN and still be effective.
    (Personal opinion) For image-based gen-AI, I think future algorithms will use pre-trained components that show an understanding of 3D objects and their transposition onto 2D planes. General image classifiers could then use the pre-trained 3D transposition as a basis for understanding 2D images, which would in theory give them an understanding of 2D object representation that is closer to our own.
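
A minimal PyTorch sketch of the "convolutional front end" idea mentioned in the comment above: the conv stack turns raw pixels into feature maps that the classifier head finds easier to correlate (an illustrative architecture, not any particular model):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # raw RGB -> 16 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 2x
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),                      # global average pool
    nn.Flatten(),
    nn.Linear(32, 10),                            # 10-way classifier head
)

x = torch.randn(1, 3, 64, 64)  # one dummy 64x64 RGB image
print(model(x).shape)          # torch.Size([1, 10])
```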

  • @squirrelzar
    @squirrelzar 10 days ago +15

    It's interesting because animals, and specifically humans as the prime example of what a "general intelligence" should be, almost prove it's not a data problem. It's a learning problem. I'd argue "the something else" is a sufficiently complex system that is capable of learning on comparatively small datasets. And we probably (a) don't have the right approach yet and, more importantly, (b) don't yet have access to the sort of technology required to run it.

    • @fakecubed
      @fakecubed 10 days ago +14

      Our brains are constantly learning in massively parallel operations every minuscule fraction of a second every single day for decades at a time, across all of our senses simultaneously. It's pretty hard to compete with that.

    • @squirrelzar
      @squirrelzar 10 days ago +6

      @@fakecubed agreed - it’s a totally different system than “just some random numbers in a black box”

    • @annasofienordstrand3235
      @annasofienordstrand3235 10 days ago +1

      It's not a learning problem either, it's a metaphysical problem. If the senses gather "information," then how do neurons in the brain know what that information represents? Try to answer without using a homunculus argument. That is seemingly impossible, and so neural coding has failed.

    • @squirrelzar
      @squirrelzar 10 days ago +7

      @@annasofienordstrand3235 I think that's what I'm getting at by saying a sufficiently complex system. The brain and its senses do ultimately boil down to a system of inputs and outputs. It's just extremely well tuned for pattern recognition, to the point where you only need to see a handful of cats to sufficiently identify all other cats. Hence my argument it's not a data problem but a learning problem. You need a system that can operate on a small subset of data and be able to distinguish the important pieces of detail while chalking the rest up to attributes of that specific thing. And that's only for classification problems. Just a tiny subset of what a general intelligence should be capable of.

    • @mrosskne
      @mrosskne 8 days ago +1

      @@annasofienordstrand3235 What does it mean to know something?

  • @Archimedeeez
    @Archimedeeez 10 days ago +8

    The camera work is enthralling.

  • @jmdz
    @jmdz 10 days ago +348

    "In science we don't hypothesize" is a really weird statement. Do you mean we don't "just hypothesize"?

    • @TohnoEn
      @TohnoEn 10 days ago +225

      Probably what he meant is "we don't just asspull and blindly keep following that same trajectory regardless of evidence"

    • @peterjohansson1828
      @peterjohansson1828 10 days ago +70

      I'm guessing he means that scientists test hypotheses by gathering and analyzing evidence. They don't just hypothesize (aka ask questions); they put in the hard work and do real scientific work.

    • @Sevalecan
      @Sevalecan 10 days ago +43

      Yeah, a hypothesis is literally part of the process defined as science.

    • @monad_tcp
      @monad_tcp 10 days ago +5

      @@TohnoEn tell that to SamAlt, but let him ride the wave.

    • @axelanderson2030
      @axelanderson2030 10 days ago +10

      in science you don't solely hypothesise

  • @pibyte
    @pibyte 10 days ago +179

    Wait... so you are suggesting that generative AI is not the magical solution for everything, as tech giants have been trying to make us believe for years?

    • @hellofranky99
      @hellofranky99 10 days ago +25

      To be fair, that only really started last year.

    • @MermaidTyrone
      @MermaidTyrone 10 days ago +13

      "For years"? Barely a year.

    • @Theonewhowantsaname
      @Theonewhowantsaname 10 days ago +31

      @@MermaidTyroneWith how much AI has been in the news it honestly feels like years. Even as a software engineer with an interest in the topic, I’m already sick and tired of hearing about AI, and find myself more and more averse to anything related to it. All the hype and grift and doomerism and debate is exhausting.

    • @olivers.7821
      @olivers.7821 10 days ago +17

      @@MermaidTyrone the things we call "AI" right now were already started almost a decade ago. Most of it shouldn't even be called AI, but the word lost all of its meaning already, so never mind.

    • @hardboiledaleks9012
      @hardboiledaleks9012 10 days ago +5

      No tech giant has ever told me generative AI was the answer. I can tell you really think human intelligence is unique to humans, like you are special in the universe or something, but you're in for a rude awakening. If I were you, I would move on from the denial phase of grief as soon as possible; it'll be better for you in the end.

  • @pedroscoponi4905
    @pedroscoponi4905 10 days ago +180

    Reminds me of the story doing the rounds a while back about people trying to use these new gen-"AI" apps to identify species of mushroom and whether they're safe to eat, and the results were, uuuuh, _dangerously_ incorrect to say the least 😅

    • @sznikers
      @sznikers 10 days ago +12

      Can't wait to see people with baskets of death caps identified as edible by some half-baked app-store app 😂

    • @DonVigaDeFierro
      @DonVigaDeFierro 10 days ago +42

      That's insane. Even expert mycologists need to get their hands on a specimen to accurately identify them.
      Irresponsible to use it, and way more irresponsible to publish it or advertise it.

    • @sznikers
      @sznikers 10 days ago +44

      @@DonVigaDeFierro Silicon Valley bros don't care; gotta hustle to pay all the coaching bills...

    • @randomvideoboy1
      @randomvideoboy1 10 days ago +10

      @@sznikers Ah yes, anything AI-related must be Silicon Valley bros hustling. Forget AlphaFold or the other AIs used in science with a high degree of accuracy.

    • @randomvideoboy1
      @randomvideoboy1 10 days ago +12

      Ah yes, let's compare shitty bootleg ChatGPT app-store AIs with the ones scientists use.

  • @JJ-hx4tc
    @JJ-hx4tc 4 days ago +1

    Pretty new to understanding all this, so bear with me. My thought is, we often judge AI under human parameters and derive utility from that - for example a "cute image" or a "friendly sentence". Now, if we set the LLM challenge to optimize for AI-to-human voice conversation, and the input is now visual and audible (as we've just seen with GPT-4o), where the AI observes what makes us "delighted", "touched", "excited", heck even "aroused", we would effectively be training the AI to act human, no? And then at some near point the uncanny valley is crossed and we perceive this as intelligence. Now, how do we distinguish that from AGI, except to say that the AI is "likely" not having a "human" or "conscious" experience, even though it is evoking that perception of experience in us?
    All that to say: a model that reaches that height would rise in the graph example based on the humanized measurements we put in place. So at which point will a perceivable plateau be reached, if we're evaluating trained behaviors through our senses? Why should we care if architecturally there is no conscious element to be attained within the AI? And where's the cap on how "real" AI will feel in interaction from now on?

  • @jonathonreed2417
    @jonathonreed2417 10 days ago +6

    This was an interesting video considering the public discussion of "power laws". I hope you do another video about "synthetic data", which is being discussed now: what it is exactly, why someone would want to use it, drawbacks, etc. I'm personally in the skeptical camp, but it would be interesting to hear an academic answer to these questions.

    • @scampercom
      @scampercom 10 days ago

      Came here to make sure someone mentioned this.

    • @bilbo_gamers6417
      @bilbo_gamers6417 10 days ago

      Synthetic data is going to be huge, and will bring a significant performance increase as time goes on. All of OpenAI's data will likely be generated in-house, in addition to the data they currently have from before the low-bar AI-generated dross began to pollute the internet.

  • @luketurner314
    @luketurner314 10 days ago +4

    2:04 "Always go back to the data" -Larry Fleinhardt (Numb3rs)

  • @jalengonel
    @jalengonel 8 days ago +4

    I’ve found that prompting these things by applying metaphorical concepts, “like this” in the style of “like this” seems to yield very very impressive results.
    Your implications about the last part seem to confirm my suspicions as to why that form of prompt engineering works so well.

    • @uy9584
      @uy9584 7 days ago +2

      Would you mind expanding on this a little?

    • @senefelder
      @senefelder 7 days ago

      Like this?

    • @uy9584
      @uy9584 6 days ago

      @@senefelder Yeah, I'm not sure I get your wording here, and an example of linking metaphors would help. As I'm reading it, it seems like you're saying "metaphor" in the style of "other metaphor", but I can't figure out why that would work. Do you have an example you could share?
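
A concrete, entirely hypothetical instance of the pattern being described in this thread, with the two "like this" slots filled in (the subject, metaphor, and style below are invented for illustration):

```python
# Hypothetical example of the 'metaphor in the style of X' prompt pattern.
subject = "a database index"
metaphor = "a library card catalogue"
style = "an enthusiastic children's book author"

prompt = f"Explain {subject} as if it were {metaphor}, in the style of {style}."
print(prompt)
```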

  • @TheRovardotter
    @TheRovardotter 1 day ago +2

    A more interesting observation is that, in its current state, AI has the potential to affect jobs, society and institutions. And that's more interesting than the potential of AGI.

  • @martypoll
    @martypoll 7 days ago +2

    Another problem is the degradation of the training set as AI results become part of the training set. I don’t think performance will ever move above the linear graph. I think the future of AI will be narrow AI. The AI that interprets X-ray images or calculates protein structure doesn’t have to be the same program/machine that does language translation or produces cat videos.

  • @marybennett4573
    @marybennett4573 10 days ago +25

    Very interesting! This reminds me of a post that demonstrated that one of the image generation AI programs couldn't successfully follow the prompt "nerd without glasses". Clearly the model determined that having glasses is an intrinsic property of "nerds", given the majority of its source images included them.
    A silly little example, but illustrative of the broader issue, I think.

    • @tamix9
      @tamix9 10 days ago +7

      That's more to do with the CLIP encoder not being great at encoding meanings like "without". Adding "without something" can instead make it more likely for that thing to appear.

    • @fakecubed
      @fakecubed 10 days ago +1

      @@tamix9 Yeah, you need to use negative prompts if you want to avoid something.

  • @djrmarketing598
    @djrmarketing598 9 days ago +7

    I think the graph flattens out more, with diminishing returns, as more examples get added - neural networks are just complex number systems, and at some point, if you take every image ever made and put it into a computer, I feel like the system just moves into a state where it doesn't really know more; it is just slightly more accurate at the same things, but still makes errors, while still not being able to identify "anything". One of the issues I think we have is that we're applying a "digital" solution to an analog system. Human and animal eyesight isn't pixels. If it were, we'd already have cameras attached to brains for the blind. The data stream of nature, I believe, is more of a "video" learning algorithm. It's not about the pixels but the change from each pixel to the next. Look at an open field - you see the field, but when a cat runs across, you see the cat, because your brain (and animal brains) are designed to see that and hear that, and thus it's much different. AI is not trained that way. We should be training AI from "birth" with a video stream teaching it about shapes and colors, like we do children. Maybe we can rapidly do that, but we're not going to see a breakthrough in AI until we're doing "real live" training and not just throwing a few billion words and images into a neural net.

    • @hexzyle
      @hexzyle 8 days ago +5

      Yeah, the reality is that these algorithms, no matter how much data you put into them, are still only reading/understanding the data within the constraints of the data type. E.g. a picture of a cat is just a picture of a cat, not an actual cat with weight, 3 dimensions, and a skeleton. These algorithms will become excellent at classifying or generating pictures of cats, but not at processing how a cat moves in 3D space, the physical effects it has on the world, or how its skeleton is oriented geometrically. The machine algorithm is still hyper-specific to "pictures", even though it has a facade of something more complex than that at first glance.

    • @thedudewhoeatspianos
      @thedudewhoeatspianos 7 days ago +1

      Even if we do that, do we have any reason to believe it can achieve a different result? It might be worth trying, but I suspect human brains use a different kind of pattern matching, and no amount of training data will overcome that hurdle. We have to change our approach.

    • @alexandrebenoit
      @alexandrebenoit 1 day ago

      What you're describing is just a video encoding. At the end of the day, computers are digital. Any video stream, no matter how you encode it, is going to be digital. I'm not saying encoding is not important - it's a huge factor in training an effective model - but we are already doing that and have been for a long time.
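
The "change from one frame to the next" idea at the top of this thread is roughly what classic frame differencing does; a minimal numpy sketch (toy motion detection, nothing like a full video-learning pipeline):

```python
import numpy as np

def motion_mask(prev_frame, frame, threshold=25):
    """Boolean mask of pixels that changed between two grayscale frames."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Toy scene: a static "field" with a bright "cat" appearing in frame 2.
rng = np.random.default_rng(1)
field = rng.integers(90, 110, size=(120, 160), dtype=np.uint8)

frame1 = field.copy()
frame2 = field.copy()
frame2[60:70, 40:55] = 200  # the moving object

mask = motion_mask(frame1, frame2)
print("changed pixels:", int(mask.sum()))  # only the "cat" region fires
```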

  • @sittingstill3578
    @sittingstill3578 10 days ago

    I followed a YouTube recommendation path that took me from bookbinding tutorials to the enrollment process for Oxford. Certainly wasn't what I expected; it was interesting though.

  • @Lantalia
    @Lantalia 10 days ago

    So, my only objection to this is that we are seeing near-independent scaling with model size, training data, ops spent training, and ops spent reasoning, especially reasoning about a set of initial attempts at a solution. There is also some success at distillation, improving the capabilities of smaller models by training them with larger ones. If we also get significant improvements in the efficiency of the underlying math (AlphaTensor is doing this) and hardware (lots of attempts, but I'm not aware of any specific significant successes yet) from applying these tools to the design problems, then we also start adding additional dimensions in which to scale, whose rate of improvement depends on the improvement of the whole.

  • @SkullCollectorD5
    @SkullCollectorD5 10 days ago +98

    Could part of the asymptote argument be that it will become harder and harder to train models on accurate data that *was not* previously generated by (another) model?
    Oversimplified: basically every written body of work released after 2022 will have to be viewed critically - not just in the prior healthy sense of informing your opinion well, but because you cannot be absolutely sure it wasn't generated, and possibly hallucinated, by an LLM. Does generation degrade as if copying a JPEG over and over?
    In that way, possible training data is limited to human history up to 2022/23 - if you were to be cynical about it.

    • @tobiasarboe5753
      @tobiasarboe5753 10 days ago +28

      I don't know if this plays into the argument from the video, but it *is* a very real problem that AI faces, yes

    • @dahahaka
      @dahahaka 10 วันที่ผ่านมา +6

      Honestly, none of this matters. Human-level intelligence will always be the lower bound of what's possible with machine learning: if a human can gain the intelligence it has from the data available to it, then there has to be enough data for a machine to gain that knowledge and intelligence too... I'm seriously confused how Dr. Pound is oblivious to that :( Just like people are blinded by hype, there seems to be a growing number of people who are blinded by their discontent with hype and are trying to disprove it. Idk what I think of that.

    • @medhurstt
      @medhurstt 10 วันที่ผ่านมา +5

      I think it's because the expectations of models are too great. It's already well known that it's possible to run an answer back through a model to improve it, and I think this is what this paper is missing (although I haven't read it!). It's unrealistic to think a model can hold the answers to all questions from training alone. Many questions simply need multiple passes, in the same way we need to think things through ourselves. I think the Computerphile issue of insufficient representation of objects in models may well be real, but it is very solvable, even if it becomes incremental improvement on the knowledge side of the AI.

    • @dahahaka
      @dahahaka 10 วันที่ผ่านมา +6

      @@medhurstt Exactly - just think: if I asked my mum to explain to me what Counter-Strike is, you wouldn't consider her unintelligent because she's only heard of it a couple of times in her life :D

    • @Jack-gl2xw
      @Jack-gl2xw 10 วันที่ผ่านมา +3

      Having enough high-quality data to train on is always a concern, but fortunately, I think any (serious) body of work that is published with the aid of a GPT model will be audited by the author to make sure the quality is still high. In this sense, I don't think this would be an issue, because the signal-to-noise ratio is still positive. If models are trained on random internet content, I would say this is more of an issue, as there may be some low-quality generated text sneaking into the training data. While training on poorly made synthetic data may cause some issues, I think the main takeaway from the paper is that more data (even quality data) will not get us to new transformative results. We need new learning approaches or techniques to better understand the data and the user's intentions. Personally, I think this is where implementing a form of reinforcement learning would help break through this plateau. Supposedly this is what OpenAI's mysterious Q-Star algorithm is thought to be doing.
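
The JPEG-recopying question raised at the top of this thread can be played with in a few lines. This is only a toy under strong assumptions - a Gaussian repeatedly refit to finite samples of its own previous fit - not a claim about any real training pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 0.0, 1.0  # the "real" data distribution
for generation in range(10):
    samples = rng.normal(mu, sigma, size=500)  # "train" on the last model's output
    mu, sigma = samples.mean(), samples.std()  # refit the "model"
    print(f"gen {generation}: mu={mu:+.3f} sigma={sigma:.3f}")

# sigma tends to drift downward across generations: the tails (rare concepts)
# are the first casualties, much like detail in a photocopy of a photocopy.
```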

  • @robinvik1
    @robinvik1 4 วันที่ผ่านมา +3

    An AI needs thousands of pictures in order to distinguish between a cat and a dog. A very small child needs about 12 pictures. Clearly they are not learning the same way we are.

  • @mz-power9587
    @mz-power9587 6 วันที่ผ่านมา

    Glad there's been some empirical data about this. I've had the same sort of hypothesis with no numbers to back it up. I think we're basically near the plateau of what these neural network implementations can do.

  • @jesustyronechrist2330
    @jesustyronechrist2330 6 วันที่ผ่านมา

    I think it's related to the "domain adversary" problem of making models more generic and adaptable, so they can identify new data and then use it accordingly - by applying a dynamic filter, or something in the middle, that translates never-before-seen input into a recognizable one.

  • @Shiftito
    @Shiftito 10 วันที่ผ่านมา +3

    4:20 is just one of the many reasons I've always loved content from this legendary channel. Kudos, fellas.

    • @wazreacts
      @wazreacts 10 วันที่ผ่านมา +2

      4:20 content is something much of the world enjoys. Oh, you meant the timestamp!?

    • @turolretar
      @turolretar 10 วันที่ผ่านมา +1

      Now that you mention it, they do like to use green a lot

  • @Lambda_Ovine
    @Lambda_Ovine 5 วันที่ผ่านมา +11

    Considering that, on top of these findings, the internet is being flooded with AI-generated garbage - to the point that models are being trained on the output of other generative AIs, producing degenerate output that goes back into the datasets - I think it's very reasonable to believe that we're going to hit a plateau or even start to see qualitative degradation.

    • @kabosune9097
      @kabosune9097 วันที่ผ่านมา

      I've been hearing this for the past 10 months, but I haven't ever seen AI image models degrade. And they can always roll back.

  • @GlennGaasland
    @GlennGaasland 6 วันที่ผ่านมา +1

    I have closely followed the development of LLMs with great interest during the past year and had many longer conversations with GPT4, Gemini and Claude about various subjects. Of course I listen to many interviews and podcasts and talks on the subject as well. Yet I remain confused about this:
    Why is almost nobody talking about processes for DESIGNING NEW DATA STRUCTURES?
    If we are seeing historic investments in building enormous new data centers, where are the corresponding investments in building new data structures?
    It seems obvious that the future of AI depends on what data is created in the next few years, through the interactions between humans as well as current generations of LLMs, giving rise to the datasets upon which future LLMs are trained.

  • @danielbrockerttravel
    @danielbrockerttravel 7 วันที่ผ่านมา

    I think the particular approach has a little ways to go before it peaks, but new approaches are going to augment this that are more efficient. It's so cool to see just how much they were able to do with scale alone.

    • @plaidchuck
      @plaidchuck 7 วันที่ผ่านมา

      Can’t hope for less innovation, otherwise we’d be fixing horses still

  • @user-rr6nb3jv9r
    @user-rr6nb3jv9r 10 วันที่ผ่านมา +9

    This was my first instinctive thought about how useful ai will be in the long run, for this type of thing. Thank you for putting it into words.

    • @fakecubed
      @fakecubed 10 วันที่ผ่านมา +1

      It won't ever be less useful than it is right now, and it's already very useful. The logarithmic growth may be true, or not, but we're still in the very steep part of it.

  • @bosstowndynamics5488
    @bosstowndynamics5488 10 วันที่ผ่านมา +120

    A plateau does make sense in that it would match the tail end of advancement of new technologies that Tom Scott frequently references with the S curve of progression, where things grumble along in the research phase, then hit a phase of explosive improvement, followed by a plateau. GPT 2 to 3 plus the move to diffusion from GANs was the start of that explosive advancement in the generative space, but GPT 4 is already trained on most of the internet so there's not really much more data left to pursue the bigger transformer approach, at least until the next new technology in the deep learning/AI space comes in.

    • @Kknewkles
      @Kknewkles 10 วันที่ผ่านมา +17

      I'm not even sure it was an explosive development afforded by technology alone. Trillions of dollars of investment hype give you a certain push, but even that has its limits.

    • @TheManinBlack9054
      @TheManinBlack9054 10 วันที่ผ่านมา +16

      Interesting idea, but what makes you think that we are not at the start of the plateau, but necessarily near its end? I mean, it's good that it's a falsifiable claim, since we can see whether GPT-5 will be any better or not. I appreciate that! Personally I do not think that we are anywhere near the end; I do not think that we have even started to increase.

    • @TheManinBlack9054
      @TheManinBlack9054 10 วันที่ผ่านมา +6

      Btw, do you think this is why all the new current models like LLAMA 3, Claude 3, and Gemini 1.5 are near the GPT-4 level? Because they've already sucked up all of the internet, basically?

    • @codelapiz
      @codelapiz 10 วันที่ผ่านมา +9

      It's kind of one of the fundamental patterns of the world: systems whose growth depends on their size are everywhere. We easily model them as a differential equation, and the function for size over time is some sort of exponential. However, the system breaks down near the extremes - all exponential growth in the world does; if it didn't, it would consume the entire universe very quickly. An animal population that doubles every year will quickly drain the environment of resources; if it didn't, it would account for every atom of the Earth within a few hundred years, and not long after, every atom in the universe. No matter how good that species is, that won't happen. (The standard way to write this down is sketched just after this thread.)

    • @LiamL763
      @LiamL763 10 วันที่ผ่านมา +21

      GPT 3.5 was trained on a massive amount of data, and people easily forget that AI researchers were adamant that its performance was the maximum we could ever hope to achieve with generative models.
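
For reference, the growth argument in the reply above is the standard logistic model; a sketch of the math, with N as population (or system) size, r the growth rate, and K the carrying capacity:

```latex
% Unconstrained growth: rate proportional to size, giving an exponential.
\frac{dN}{dt} = rN \quad\Rightarrow\quad N(t) = N_0 e^{rt}

% A resource limit K bends the exponential into a logistic curve, which
% looks exponential early on and flattens near the carrying capacity:
\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)
\quad\Rightarrow\quad
N(t) = \frac{K}{1 + \frac{K - N_0}{N_0}\, e^{-rt}}
```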

  • @cefcephatus
    @cefcephatus 2 วันที่ผ่านมา +1

    A logarithm has no finite limit at infinity, but its rate of improvement (the derivative) goes to 0 as you approach infinity, for sure.
    That's the technical meaning of "flattening out".
    And what usually causes this growth curve is the very foundation of conditional statistics, Bayes' Theorem. No matter how you improve the learning algorithm, if the output is determined by summing data distributions, the result will be a sigmoid function. This is worse than a pure logarithmic function, because it actually converges and has a limit at infinity, so this approach has a hard ceiling.
    If you want to improve AI, don't just apply a statistical state machine: you'll have to embrace superposition and code the quantum wave function into it. Yes, it still uses Bayes' Theorem to discern an output, and the solution is more complicated on traditional computers. But instead of a sigmoid, now we have chaos to deal with, which sounds cool if we could stay on the upper branch of each bifurcation.
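
The "flattening out" claim above, written out as a sketch (with sigma denoting the standard logistic sigmoid):

```latex
% The log keeps growing without bound, but its slope dies off:
\lim_{n \to \infty} \log n = \infty,
\qquad
\frac{d}{dn} \log n = \frac{1}{n} \to 0 \quad \text{as } n \to \infty

% A sigmoid is stricter still: it converges to a hard ceiling.
\sigma(x) = \frac{1}{1 + e^{-x}},
\qquad
\lim_{x \to \infty} \sigma(x) = 1
```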

  • @programninja6126
    @programninja6126 7 วันที่ผ่านมา +2

    Everyone who wants to understand AI should study mathematical statistics. The Central Limit Theorem's convergence rate of 1/sqrt(n) isn't just a coincidence: it's the rate of the best linear unbiased estimator of any one-dimensional problem. (There are technically other estimators, like one based on min(x), with their own convergence rates, but none better than 1/sqrt(n).) Machine learning models may have the advantage of being non-linear, and can thus fit many more relationships than simple linear regression, but they obviously can't outperform linear models in cases where the data actually is linear. So the idea that a model could see anything other than diminishing returns is yet to be shown and would break statistics at its core. (In fact, 1/sqrt(n) is the best case: high-dimensional models have convergence rates of n^(-1/5) or worse, so if your data has a high intrinsic dimension, convergence will obviously be slower.)
    On the other hand, people pay a lot for this kind of research, and it's excellent job security to keep trying for something you know probably won't happen.
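
The 1/sqrt(n) rate mentioned above is easy to check numerically. A minimal Monte Carlo sketch (the sample sizes and replication count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# The error of a sample mean shrinks like n**-0.5, so each halving of the
# error costs 4x the data - diminishing returns baked into the statistics.
for n in [100, 400, 1600, 6400]:
    errors = [abs(rng.normal(0.0, 1.0, size=n).mean()) for _ in range(2000)]
    theory = np.sqrt(2 / np.pi) / np.sqrt(n)  # E|mean| for a standard normal
    print(f"n={n:5d}  observed={np.mean(errors):.4f}  theory={theory:.4f}")
```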

  • @matthimf
    @matthimf 10 วันที่ผ่านมา +3

    Great video! There is another limit: the amount of available data. I've heard about what they call "model collapse", which happens when you continuously feed models their own synthetic data. It would be great if you could do a video on this topic too! It means that non-synthetic data will become more and more valuable.

    • @justtiredthings
      @justtiredthings 10 วันที่ผ่านมา

      The research has repeatedly demonstrated that synthetic data can boost genAI performance

  • @CjqNslXUcM
    @CjqNslXUcM 10 วันที่ผ่านมา +27

    The type of inference done by LLMs is similar to human extemporization, which boils down to memory retrieval. There doesn't seem to be a process of introspection, deliberation and change.

    • @sopwafel
      @sopwafel 10 วันที่ผ่านมา +4

      And that will never change!

    • @CubicSpline7713
      @CubicSpline7713 10 วันที่ผ่านมา +3

      Temporal context is missing.

    • @nickthurn6449
      @nickthurn6449 10 วันที่ผ่านมา +6

      That is true. It is also true that those processes exist in the real world and we have no idea how they operate.
      We know that only some entities (human minds) reliably perform these processes and that there is a vast difference between the ability to generate an insight and the ability to recognise / appreciate an insight.
      Eg millions appreciate a hit song despite its simplicity but only one mind came up with it.

    • @clray123
      @clray123 10 วันที่ผ่านมา

      The generative algorithms cannot and do not do loops.

    • @Pixelarter
      @Pixelarter 10 วันที่ผ่านมา

      The new chain-of-thought algorithms are trying to mitigate that. They introduce a kind of introspection and deliberation before giving an answer.
      And from what I read the results are promising. The only problem is that they consume a lot of processing. OpenAI's Q* is speculated to work like that.
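
A minimal sketch of the draft-critique-revise loop described in the reply above. `call_llm` is a hypothetical stand-in returning canned text so the sketch runs; a real version would call an actual model:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a chat-completion call.
    return f"[model output for: {prompt[:40]}...]"

def deliberate(question: str, rounds: int = 2) -> str:
    """Draft an answer, then critique and revise it a few times."""
    answer = call_llm(f"Answer step by step: {question}")
    for _ in range(rounds):
        critique = call_llm(f"List flaws in this answer to '{question}':\n{answer}")
        answer = call_llm(f"Rewrite the answer to '{question}', fixing:\n{critique}")
    return answer

print(deliberate("Why does scaling data give diminishing returns?"))
```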

  • @JimiVexTV
    @JimiVexTV 3 วันที่ผ่านมา

    I think the right direction may be multi-model setups, where highly specialised models are brought in for dedicated tasks. Think one AI that you interface with, which directs multiple specialised agents to pop in and out of a workflow to match the user's needs. Or think of a modular setup where multiple AIs, each dedicated to a specific strain of reasoning or information, intermingle using a shared language of sorts.
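
A toy version of the routing idea above. The keyword rules and specialist registry are invented purely for illustration; real systems route with a trained classifier or with an LLM itself:

```python
# Hypothetical specialists: in practice each would be a separate model call.
SPECIALISTS = {
    "botany":  lambda q: f"[plant model answers: {q}]",
    "code":    lambda q: f"[code model answers: {q}]",
    "general": lambda q: f"[generalist model answers: {q}]",
}

def route(question: str) -> str:
    """Crude keyword router dispatching to a specialist."""
    q = question.lower()
    if any(w in q for w in ("tree", "plant", "leaf")):
        return SPECIALISTS["botany"](question)
    if any(w in q for w in ("bug", "python", "compile")):
        return SPECIALISTS["code"](question)
    return SPECIALISTS["general"](question)

print(route("What species of tree is this?"))
```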

  • @frostydog860
    @frostydog860 8 วันที่ผ่านมา +1

    This is an absolute gem! Great explanations with references to the research! Keep it up. :)

  • @HolyBuddha82
    @HolyBuddha82 9 วันที่ผ่านมา +17

    The idea that one model will rule them all will fade away; rather, we will see the co-existence of multiple models working together, where you end up with a sort of decision-tree process that determines which model(s) get used to complete the desired task. These smaller models would be faster and lighter to run. We are already starting to see this emerge with things like LangChain and AI agents. It's rather interesting to see how fast all this is evolving. Great content as always!!

    • @AnimeUniverseDE
      @AnimeUniverseDE 6 วันที่ผ่านมา +1

      It's rather depressing actually

    • @Aircalibur
      @Aircalibur 2 วันที่ผ่านมา +1

      You get it. For anything not completely lowest common denominator or ridiculously specialized it'll be people passing things from one AI to another, assuming the AIs can even mesh. And if you have to do that too much, it might just be cheaper to have a human do the task from the beginning to the end. If just one of the AIs performs unpredictably, the whole thing is broken too.

  • @jimmy21584
    @jimmy21584 10 วันที่ผ่านมา +15

    Considering the open source community’s achievements with model building, fine tuning and optimisation, I think the really interesting improvements will come from there, and not so much from throwing more data into training and stacking more layers.

    • @fakecubed
      @fakecubed 10 วันที่ผ่านมา +5

      Yeah, the open source community is where all the best innovations are happening in figuring out how to make the most out of the models that are out there, but big investor money paying for PhDs to do research gets us important mathematical innovations and big volumes of training data. Both are necessary.

    • @sebastiang7394
      @sebastiang7394 8 วันที่ผ่านมา

      Is it really, though? Unix was a commercial product. The first desktops were commercial products. In what discipline does open source lead the technical curve?

    • @fakecubed
      @fakecubed 8 วันที่ผ่านมา +1

      @@sebastiang7394 The first desktops were home-built kit computers. Open source doesn't mean non-commercial.

    • @sebastiang7394
      @sebastiang7394 5 วันที่ผ่านมา

      @@fakecubed Ok, so you're talking about a desktop PC, not the concept of a GUI. Misunderstanding on my part, sorry about that. Yeah, tinkerers definitely do a lot of great stuff, especially in the early days of computing and still today. But I think it's hard to argue that the majority of innovation happens in the open source world. Just look at all the stuff that happened at Bell Labs or IBM. The beginning of open source (the GNU project) basically aimed at reproducing all the commercial tools that already existed - they aimed at recreating the UNIX ecosystem. And still today, most big open source projects are either backed by large corporations or are rather small in scale.

  • @vng
    @vng 9 วันที่ผ่านมา

    The paper is about text-image and image-text vision-language models. I wonder if the same applies to large language models (and the large-action models being developed). It would be interesting if someone crunched the numbers for GPT-2, GPT-3, GPT-4, llama, and other LLMs to see whether their performance curves are linear, exponential, or log-linear.

  • @Kevin_Street
    @Kevin_Street 10 วันที่ผ่านมา

    Fascinating! I think the central problem is with "e" in the diagram. You say it's "...a numerical fingerprint for the meaning in these two items..." But can "meaning" be reduced to a numerical fingerprint? It doesn't seem likely.
    The result of their paper sounds just like what you'd get when you're trying to achieve greater and greater performance from a system that has an ultimate limit. In this case the dataset is limited to the meaning humans put into it. Every picture can be equated to a series of words, by putting together sequences of words you can describe pictures, and so on. The fact that the dataset contains actual pictures and words that correspond with each other, and not just random noise is meaning that humans injected into the data.
    By representing the data numerically the computer can try out all the possible combinations until it finds the ones humans say are correct. It's like a blind watchmaker randomly putting together parts until someone tells him he built a working watch. The more parts there are, the longer it takes him to randomly build the watch. That's like what's happening in the paper. More and more data means it takes longer to find the inherent meaning, but the computer can't put any meaning of its own into the data (it can't build a "watch" that isn't already implicitly "there" in the parts), so eventually it approaches an ultimate limit.
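
The "numerical fingerprint" discussed above is an embedding vector; whether it captures "meaning" is the open question, but mechanically it is just geometry. A toy sketch of a shared text/image space (the 4-d vectors are made up - real CLIP-style models use hundreds of dimensions):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two embeddings: 1.0 = same direction, 0.0 = unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

text_cat  = np.array([0.9, 0.1, 0.0, 0.2])  # hypothetical embedding of "a cat"
image_cat = np.array([0.8, 0.2, 0.1, 0.3])  # hypothetical embedding of a cat photo
image_dog = np.array([0.1, 0.9, 0.2, 0.0])  # hypothetical embedding of a dog photo

print(cosine(text_cat, image_cat))  # ~0.98: same concept, vectors align
print(cosine(text_cat, image_dog))  # ~0.21: different concept, vectors diverge
```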

  • @davidgoodwin4148
    @davidgoodwin4148 10 วันที่ผ่านมา +3

    There are some cases where overtraining does facilitate deeper learning, but in the example of pictures and text it does not help. What is missing is etymology (why we call an elephant an elephant) and examples. So you do have to "tell" it the rare things, which means either more training or deeper prompts. I am working on deeper prompts - "considering this document, answer this question" - where you then hide that prompt to make an AI expert tool from that content. This works for many things. Learning what a more specific elephant is sounds like more training, but it could be "given these additional examples (elephant)..." and it would be able to adjust.

  • @Kneephry
    @Kneephry วันที่ผ่านมา +3

    So AI models struggle with accuracy in the absence of sufficient training data. Last I checked human beings have a similar limitation.

  • @Ptaku93
    @Ptaku93 6 วันที่ผ่านมา +2

    This plateau isn't a problem, it's a solution. I'm so thankful for it, and I hope other papers on the topic repeat these findings.

  • @uncletimo6059
    @uncletimo6059 5 วันที่ผ่านมา +3

    Have clickbait titles peaked?
    No, not by a long shot. Also, do not forget to put your face in the thumbnail - the algo loves this.

  • @specy_
    @specy_ 7 วันที่ผ่านมา +2

    One thing I'm most worried about with LLMs is that we can only improve them if more data is found or the architecture changes. Let's say the second is doable - we still need the first. Where do we get this data? A big portion of it comes from the internet: blog posts, articles, open source software, etc. And who uses LLMs the most? Blogs, articles, and open source software. We are also polluting the web with low-quality LLM-generated text, and we all know what happens when you feed an AI its own output... It will be analogous to overfitting. We could find a way to detect and exclude LLM-generated content, but pretty much everything nowadays uses a little bit of it.
    At best we have very little salvageable data to use; at worst, the LLMs actually start getting worse.

  • @svetlovska
    @svetlovska 9 วันที่ผ่านมา +2

    Very impressed. Not least by how straight you can draw a freehand graph axis.

  • @joelandrews8041
    @joelandrews8041 10 วันที่ผ่านมา +22

    One potential solution - Has there been any research into developing an AI model which classifies a problem for use by another more specialised AI model?
    For the plant/cat species case, a master AI would identify the subject, and would then ask the specialised subject specific AI model for further info on that subject. This prevents the master AI from needing all the vast amount of training data of the subject specific AI.
    Not sure if I've explained this very well!

    • @zactron1997
      @zactron1997 10 วันที่ผ่านมา +25

      The concept makes sense, but the problem is more fundamental according to this paper. It's not about whether you can train a model to do all these things, it's about the requirement for data at a rate that doesn't match the supply.
      In your example, the problem is never having enough data to make a good enough "specialist" AI to a sufficient quality.

    • @helix8847
      @helix8847 10 วันที่ผ่านมา +9

      Yeah, it's called fine-tuning. Companies are doing it all the time right now, but are not just giving it away.

    • @terbospeed
      @terbospeed 10 วันที่ผ่านมา +11

      Agents, Mixture of Experts, RAG, etc

    • @joelandrews8041
      @joelandrews8041 10 วันที่ผ่านมา

      @@helix8847 thanks for this. I'd like to learn more about this @Computerphile!

    • @JayS.-mm3qr
      @JayS.-mm3qr 10 วันที่ผ่านมา +3

      Man.... they are developing advancements for everything. Any problem you can think of, people are coming up with ways to address in code. That is the magic of coding: you have a problem, you express the problem and solution in a virtual environment, in coded language that computers understand, and the computer outputs something that we understand. Have you heard of AI agents? Those address the thing you asked about. It turns out that using multiple LLMs, and developing the ability for AI to complete tasks, makes them a lot more effective - and that's true without increasing the data size. Yes, models are getting better without new data.

  • @Quargos
    @Quargos 10 วันที่ผ่านมา +20

    Honestly, it sounds like the conclusion you're coming to - about it not being able to infer new things that aren't well represented - is a simpler thing that feels like it ought to be obvious: the nature of the dataset informs the nature of the resulting model. If the dataset doesn't have the information to differentiate between tree species (or if the info it has there is highly limited), then of course the model won't either. The model is simply a "compressed" form of the information fed in, and nothing else.
    That "you show it enough cats and dogs and the elephant is just implied", as said at the start, can never hold. Because if you've never seen an elephant, how could you ever be expected to recognise or draw one? I do not believe that extrapolation to anything outside of the dataset will ever truly work.

    • @peterisawesomeplease
      @peterisawesomeplease 9 วันที่ผ่านมา +1

      I don't like the elephant example either. But I think the point of the paper isn't just that the model won't handle data that is missing. The point is that you need exponentially growing amounts of data to get linear increases in performance, and we are already running out of high-quality data to feed models.

    • @wmpx34
      @wmpx34 9 วันที่ผ่านมา +4

      I agree, but it sounds like many AI researchers don’t. So where’s the disconnect? Either we are wrong or they are overly optimistic. Like the guy in this video says, I guess we will see in a couple years.

    • @alexismakesgames6532
      @alexismakesgames6532 9 วันที่ผ่านมา +1

      The "elephant" speaks to human creativity. It's the ability to make new things based on base principles. Maybe being trained on only cats and dogs is too optimistic to make an elephant but say it also had some other animals so it knew what "horns" were ect. The hope is you could then give it a very detailed description involving concepts it already knew and hopefully get an elephant. Then you could teach it "elephant" and it basically becomes able to learn anything that can be described. There are a lot of other steps but this is one of the keys to having AGI.
      I agree though, it is terribly optimistic to think this will happen with the current ML models. Which is my main problem with them, they pretty much regurgitate only what they have a lot of basis for and become very derivative and patchy in areas where the data is thin.

    • @lopypop
      @lopypop 9 วันที่ผ่านมา

      In the same way that it will probably eventually be able to solve math problems it hasn't been explicitly trained on, I think the argument goes that once it "understands" enough fundamentals of the real world, it can extrapolate out quite creatively.
      This won't work with the example of recalling strict factual data that it hasn't been trained on ("draw me an elephant" ), but it might work with enough prompting to get something reasonable (generate an image of a large land mammal of a specific size, anatomical properties, and other qualities). It's possible that it generates an image that looks like an elephant without ever being trained on elephant photos

    • @Ylyrra
      @Ylyrra 8 วันที่ผ่านมา +4

      @@wmpx34 Most AI "researchers" aren't asking the question. They're too busy being engineers and looking to improve their AI, not look at the bigger picture. The bigger-picture people at all those companies have a vested interest in a product to sell that has a long history of hyperbole and "just around the corner" wishful thinking.
      You don't ask the people building a fusion reactor how many years away practical fusion is and expect an unbiased remotely accurate answer.

  • @pyonchan1804
    @pyonchan1804 9 วันที่ผ่านมา +2

    AI training has mostly used curated training data, so giving models open access to the internet could be a factor the paper hasn't taken into account; it will be interesting to see what happens.

    • @Ylyrra
      @Ylyrra 8 วันที่ผ่านมา +6

      The mother of all poisoned-well attacks is what happens. AI isn't ready for bad-faith training sets. It barely copes with accidental biases in the training data selection.

  • @Axel-gn2ii
    @Axel-gn2ii 4 วันที่ผ่านมา +1

    The amount of data on the internet is growing exponentially and models are coming down in size while retaining performance

  • @existenceisillusion6528
    @existenceisillusion6528 10 วันที่ผ่านมา +16

    CoT, ToT, sparse/MoE, etc. - these are approaches to improving models beyond just scaling data or model size. There are really a lot of "tricks" to improve performance. (A toy MoE gate is sketched after this thread.)

    • @clapppo
      @clapppo 10 วันที่ผ่านมา +1

      yeah, but to generalise for everything I don't think saying "think this through step by step" is going to quite cut it

    • @JayS.-mm3qr
      @JayS.-mm3qr 10 วันที่ผ่านมา +10

      Yeah, there are improvements all the time - spatial reasoning ability is a recent one. Anyone who thinks AI has peaked isn't paying attention at all.

    • @fakecubed
      @fakecubed 10 วันที่ผ่านมา +2

      Yeah, tons of new stuff is happening with the models we already have, and new models are being built rapidly, apparently the people building them think they are going to be better enough to be worth it. This to me says we are not yet at the peak, or anywhere close. But things are moving so quickly, it's possible we'll reach that peak sooner than anyone expects.

    • @mvmlego1212
      @mvmlego1212 10 วันที่ผ่านมา +1

      Do you think that these techniques will simply decrease the base of the logarithm discussed at 7:50, or do you think they will improve the efficiency class? e.g. to sqrt(n)

    • @prolamer7
      @prolamer7 10 วันที่ผ่านมา +1

      @@JayS.-mm3qr Exactly - even I, as an enthusiast, have seen many recent papers that improve performance. If they are combined in the right way, it will produce much more capable models.
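
A toy sketch of the sparse/MoE trick named at the top of this thread: a gate scores the experts and only the top-k run per input, so capacity can grow without proportional compute. All weights here are random, purely to show the mechanism:

```python
import numpy as np

rng = np.random.default_rng(1)
n_experts, d = 8, 16
gate_w = rng.normal(size=(d, n_experts))               # router weights
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

def moe_forward(x: np.ndarray, k: int = 2) -> np.ndarray:
    scores = x @ gate_w                 # one score per expert
    top = np.argsort(scores)[-k:]       # indices of the k best experts
    gates = np.exp(scores[top])
    gates /= gates.sum()                # softmax over just the chosen k
    # Only k of the 8 expert matrices are ever multiplied for this input.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

print(moe_forward(rng.normal(size=d)).shape)  # (16,)
```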

  • @poduck2
    @poduck2 10 วันที่ผ่านมา +10

    I'm curious about how AI will train itself as more and more AI content gets disseminated on the Internet. Will it degrade the performance, like photocopying pages that have been photocopied? It seems like it should.

    • @ItIsJan
      @ItIsJan 10 วันที่ผ่านมา +2

      in the best case, the AI won't get better

    • @itssardine5351
      @itssardine5351 10 วันที่ผ่านมา

      Yeah thats what I’m curious about. A couple years back it was extremely easy for big companies to mass-scrape the internet and take everything but now they would need to fight against themselves

    • @nadinegriffin5252
      @nadinegriffin5252 2 วันที่ผ่านมา

      Most quality information isn't found on the internet anyway. Quality writing and information is found in books and novels, and paywalls prevent access to quality material online.
      AI already has an issue with plagiarism and not citing sources.
      It's like getting a handwritten photocopy of a photocopy about the Cabbage Soup Diet in the 90s: it claims it was developed at a hospital but has nothing to back that up. It isn't a published paper, it doesn't link to the hospital's website, and it doesn't give you a contact to verify the information.
      In fact, because AI takes in such poor-quality information, I wouldn't be surprised if, when asked about the Cabbage Soup Diet, it told me it was developed at a reputable hospital and is great for losing weight. 🤣

  • @MrGryph78
    @MrGryph78 9 วันที่ผ่านมา

    The scaling is happening on the other end of the equation too, though. The data sets need to grow to get better performance out of the models, but the cost of that training decreases as better hardware enables more efficient training on those larger data sets. It's then a matter of which is growing faster: the size of the data sets needed to keep getting improvements, or the computing power and efficiency. If the graph of performance vs data set size is logarithmic, and the growth of computing power and efficiency is exponential, then maybe the true resultant trend line is closer to linear.
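
The arithmetic behind that comment, sketched out: if performance is logarithmic in data D, and the affordable data/compute D grows exponentially in time t, the composition is linear in t:

```latex
P(D) = a \log D + b, \qquad D(t) = D_0 e^{kt}
\;\Rightarrow\;
P(t) = a \log\!\left(D_0 e^{kt}\right) + b = a k\, t + a \log D_0 + b
```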

  • @bakawaki
    @bakawaki 2 วันที่ผ่านมา +2

    I hope so. Unfortunately, these mega companies are investing a ludicrous amount of money to force a line upwards that will inevitably plateau, while utterly disregarding ethics and the damage it will cause.

  • @EJD339
    @EJD339 8 วันที่ผ่านมา +5

    So can someone explain to me why, when I google a question now, it tends to highlight the wrong answer or not even answer the question I was searching for? I feel like it didn't used to be like that.

  • @grayfaux_
    @grayfaux_ 10 วันที่ผ่านมา +4

    This paper assumes that we continue to try and make progress using the same tech now as we will in the future. I think we can see over human history that progress is usually accompanied by a new methodology or technology. If this was the case we would be here all day doing backpropagation on an abacus. Great video by the way. Thanks!

    • @fakecubed
      @fakecubed 10 วันที่ผ่านมา +3

      Yep. Every week or two there's another big paper showing off some new concept. They don't all result in major improvements but there's so much active research going on, and the open source community is quick to try everything out and even make older models work better.

  • @catcoder12
    @catcoder12 9 วันที่ผ่านมา

    Throughout my undergrad, I've seen Dr Pound go from talking about cryptography to AI. How time flies!

  • @sbmoonbeam
    @sbmoonbeam 7 วันที่ผ่านมา

    You also don't know whether you're in the exponential growth phase of an S-curve that plateaus further along. But this is also about the limitations of zero-shot solutions: humans don't really come up with full solutions to complex problems by immediate recognition, although recognition helps with categorising the initial problem. So maybe it suggests a shift to a new phase of approaches around how an AI agent gathers further info might be imminent (if I'm being optimistic).

  • @johnarnebirkeland
    @johnarnebirkeland 10 วันที่ผ่านมา +19

    Expecting AGI from increasing LLM size and complexity sounds a lot like emergence in complex systems theory. I.e. there is precedent for this happening in biology etc., but there is absolutely no guarantee that it will happen, or, if there is emergence, that it will be anything useful in the context of AGI interacting with humans. But then again, you could also argue that the LLM results we currently see are already proof of emergence.

    • @Qacona
      @Qacona 10 วันที่ผ่านมา +13

      I suspect that we'll develop models that are able to fool humanity into thinking they're AGI long before we actually hit 'real AGI'.

    • @thesquee1838
      @thesquee1838 10 วันที่ผ่านมา +1

      How do we see proof of emergence with the current LLM results? I'm curious

    • @kneesnap1041
      @kneesnap1041 10 วันที่ผ่านมา

      I think it's more likely a new technique / algorithm, or rather a collection of them, rather than a complex system creating unexpected behavior though.
      Think about how much "training data" a person gets. We only need to see a handful of cats & dogs before we can reliably tell them apart. We don't need dozens, hundreds, or thousands.
      To imagine that massive datasets are an absolute requirement for AI seems a bit unlikely to me because of this.

  • @emerestthisk990
    @emerestthisk990 5 วันที่ผ่านมา +2

    I'm what you'd call a "creative" who uses Photoshop and After Effects, and in my professional work and personal time I cannot find a single use for AI. After a period of playing around with ChatGPT I haven't gone back to it. I don't even use the new AI features in PS. I think the corporations and big tech really want AI to be the next revolution and are trying to sell it to us as that, so they can line their collective shareholders' pockets. But the reality is far from the monumental shift being sold.

    • @francisco444
      @francisco444 5 วันที่ผ่านมา

      Lol not a single use for AI?
      I've seen so many creatives make this mistake it's actually enraging. Creatives are about pushing the boundaries, fostering connection, and mastering their tools. If AI is not at all for you, that's fine. But you'll find yourself in the back of the line eventually.

  • @tsuyusk
    @tsuyusk 9 วันที่ผ่านมา

    I loved the drawing of the Big O notation curves

  • @user-yj3mf1dk7b
    @user-yj3mf1dk7b 9 วันที่ผ่านมา +1

    No one promised that AI would be one-shot LLMs: question -> answer.
    Classification cat/dog/elephant can be:
    10 agents describe images, then vote on answers -> get an answer.
    The issues: how to measure performance etc.,
    and whether it will work at scale or is just some % improvement.
    It's well known: quality beats quantity.

    • @mrosskne
      @mrosskne 8 วันที่ผ่านมา

      Try complete sentences some time.
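
A minimal sketch of the ten-agents-vote idea from the comment above; the agents' answers are canned labels standing in for real model calls:

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most common answer among the agents."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical outputs from 10 describe-the-image agents.
agent_answers = ["cat", "cat", "dog", "cat", "elephant",
                 "cat", "dog", "cat", "cat", "cat"]
print(majority_vote(agent_answers))  # "cat" wins 7-2-1
```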

  • @shadee0_106
    @shadee0_106 5 วันที่ผ่านมา +10

    this video has funny timing considering that gpt-4o has just released

    • @sjzara
      @sjzara 5 วันที่ผ่านมา +7

      What has that got to do with it? ChatGPT 4o still has these problems.

  • @MrVersion21
    @MrVersion21 10 วันที่ผ่านมา +6

    You can see the same in speech recognition. Performance correlates to the log of the number of training examples.
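
That log relationship is easy to eyeball with a straight-line fit against log(n). The accuracy numbers below are hypothetical, purely to show the method:

```python
import numpy as np

n_examples = np.array([1e3, 1e4, 1e5, 1e6, 1e7])
accuracy   = np.array([0.61, 0.70, 0.78, 0.87, 0.95])  # made-up figures

# Fit accuracy = slope * log10(n) + intercept.
slope, intercept = np.polyfit(np.log10(n_examples), accuracy, deg=1)
print(f"accuracy ~ {slope:.3f} * log10(n) + {intercept:.3f}")
# A roughly constant slope means each 10x of data buys the same fixed bump.
```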

  • @feandil666
    @feandil666 7 วันที่ผ่านมา +1

    Never thought about it, but it makes a lot of sense - a lot more than a "singularity".

  • @theronwolf3296
    @theronwolf3296 6 วันที่ผ่านมา +1

    There is a big difference between pattern matching and comprehension. At the first level, comprehension allows a mind to eliminate spurious matches, but further on, comprehension allows coming to conclusions that did not exist in the data. This is what intelligence really is. (GenAI could not have independently conceived of this study, for example). Essentially it's regurgitating whatever was fed into it. Actual comprehension goes far beyond that.
    Nonetheless, this could be very useful for finding information. An AI trained on millions of court cases, for example, could help a (human) lawyer track down relevant information that is critical... but it would require the application of human intelligence to determine that relevance, as well as to eliminate the material that does not apply.

  • @rozk_yt
    @rozk_yt 10 วันที่ผ่านมา +69

    Happy to see general sentiment trending in a more realistic and less hype-based direction for this tech. I've been trying to make this same impression on people I know irl for ages now, especially people panicking about their jobs and AI sentience and other similar bs as if there is any likelihood of it happening in the immediate future. I blame TH-cam for recommending all those horrible doomer AI podcasts indiscriminately

    • @Kknewkles
      @Kknewkles 10 วันที่ผ่านมา +11

      Heh, I was worried last year, so I learned the basics of "AI". Since then I'm not as worried, but increasingly annoyed at the first tech bubble that I clearly understand to be just that: a bubble. A marketing bundle of the most promising buzzwords in history.

    • @dahahaka
      @dahahaka 10 วันที่ผ่านมา +6

      Anti-hype sentiment that disregards very basic mathematics and statistics isn't a good thing IMO - it's just as bad as the hype.

    • @pedroscoponi4905
      @pedroscoponi4905 10 วันที่ผ่านมา +34

      I think a lot of people worried about their jobs are less worried about the AI actually matching their skill level, and more about their bosses/clients buying into the hype and replacing them anyway.

    • @riccardoorlando2262
      @riccardoorlando2262 10 วันที่ผ่านมา +11

      @@pedroscoponi4905 Honestly, that's a much more real problem. Just like with all tech bubbles before this one..

    • @inkryption3386
      @inkryption3386 10 วันที่ผ่านมา +10

      ​@@pedroscoponi4905 yeah this isn't a technological issue, it's a class issue.

  • @ianscottuk
    @ianscottuk 10 วันที่ผ่านมา +11

    What is the paper?

    • @earleyelisha
      @earleyelisha 10 วันที่ผ่านมา +7

      No "Zero-Shot" Without Exponential Data: Pretraining Concept
      Frequeney Determines Multimodal Model Performance

    • @emmettobrian1874
      @emmettobrian1874 10 วันที่ผ่านมา +1

      It's linked in the description.

  • @ruspj
    @ruspj 3 วันที่ผ่านมา +1

    might make sense that it would follow the pessimistic curve, simply because I would guess double the data adds maybe something like 50% to the accuracy rather than doubling it.
    this doesn't necessarily mean that we are close to the curve leveling out, though. with the fast improvement at the moment there could be a long way yet to go before things start to noticeably slow down.
    it doesn't just depend on getting more and more data: with machine learning being a rather new technology there are bound to be loads of tweaks and improvements to the training methods, and with computers continuing to get faster they will always be able to do more and more passes over the training data for better results.
    the amount of data, improvements in training methods, and computers getting faster will all produce smaller and smaller improvements each time, with diminishing returns.

  • @MushroomFleet
    @MushroomFleet 3 วันที่ผ่านมา

    I train these image diffusion models and wholly agree: Data Quality is more important than Data Quantity. There are many competing CLIP models, so we are seeing new approaches and refined processing rather than larger and larger models now.

  • @ozzi9816
    @ozzi9816 10 วันที่ผ่านมา +4

    This has been something I’ve suspected for quite a while now. There are people calling for more hand-written algorithms to assist the transformers, namely to allow AI to create “blocks” of information that they can assemble in novel ways akin to how humans learn and experiment, but apparently there’s a LOT of bad blood between AI people and handwritten algorithm people and that’s why it hasn’t happened yet.