OpenAI Unveils o3! AGI ACHIEVED!

  • Published on 20 Dec 2024

Comments • 743

  • @xacompany
    @xacompany 4 hours ago +129

    Rockstar Games has been waiting for O3 to start developing GT6

    • @dijitize
      @dijitize 4 hours ago +4

      I think game development and software development will never be the same because of these AI tools.

    • @Sumyunguy2
      @Sumyunguy2 4 hours ago +1

      That's exciting!

    • @0AThijs
      @0AThijs 3 hours ago +8

      Lol.
      Opens chatgpt
      Prompt: Create GTA 6

    • @TheNexusDirectory
      @TheNexusDirectory 3 hours ago

      @dijitize it's gotta improve 100x before it will be truly useful in software development.

    • @kas90500
      @kas90500 an hour ago +1

      Gran Turismo 6 was released in 2013; it would still be impressive for o3 to do it.

  • @Jon-y1n
    @Jon-y1n 4 hours ago +154

    It is AGI when I can let it take control of my work PC without my manager noticing my absence for weeks...

    • @TheNexusDirectory
      @TheNexusDirectory 3 hours ago +38

      Not a joke. If it can't do that then it's not AGI

    • @dot1298
      @dot1298 3 hours ago +7

      good criterion, agreed

    • @bestemusikken
      @bestemusikken 3 hours ago +6

      Can general intelligence do that? As in, can anyone substitute for you? I don't think so. Why set the bar so high for artificial general intelligence when "normal" intelligence can't clear it?

    • @anta-zj3bw
      @anta-zj3bw 3 hours ago

      lmao

    • @TheNexusDirectory
      @TheNexusDirectory 3 hours ago +6

      @bestemusikken but at the end of the day this is the entire hope of AGI.

  • @gnollio
    @gnollio 2 hours ago +43

    Get your shovels ready folks, time to dig up the goalpost.

    • @Ascended23
      @Ascended23 28 minutes ago

      @gnollio yep. AGI will be “achieved” a great many times before we ever arrive at a consensus on what, precisely, AGI means.

  • @ares106
    @ares106 4 hours ago +188

    Until this is in the hands of independent testers I will remain skeptical.

    • @DefaultFlame
      @DefaultFlame 4 hours ago +14

      Yuuuup. I don't trust OpenAI at all on anything they claim. Until it's in my hands and I can see what it can actually do, I don't believe anything their hype department puts out. Just look at Sora.

    • @brianmi40
      @brianmi40 4 hours ago +3

      Still skeptical of o1? Did you say the same thing then? Learned anything since then?

    • @csansolo
      @csansolo 4 hours ago +2

      Thanks Sherlock. Because what they have done so far is just pure rubbish isn't it?

    • @John4343sh
      @John4343sh 4 hours ago +8

      It has been independently tested by one of the biggest critics of LLMs and even he said this is a huge paradigm shift.

    • @noway8233
      @noway8233 4 hours ago +3

      Absolutely, I don't believe in this. These companies always come with the same script. It's probably a very good model, but... it's genius until it's not.

  • @localism479
    @localism479 4 hours ago +51

    It is impressive, but saying it is AGI is clickbait. The G is for general, you know that. They are focused on the benchmarks, and let’s celebrate that progress. But don’t call it AGI, they are still “teaching to the test”.

    • @yoyo-jc5qg
      @yoyo-jc5qg 2 hours ago +5

      they solved ABI, now chatgpt can get a job as a benchmark genius

    • @drhxa
      @drhxa 2 hours ago +1

      The point is that they're not teaching to the test. Also, you can't "teach to the test" because all problems in ARC-AGI require unique types of reasoning.
      This is the most generally intelligent model out by far and far more general than the vast majority of humans. If it can't do something yet that humans can do, sure, but no human can do everything that humans can do either.
      This is obviously AGI

    • @DejayClayton
      @DejayClayton 11 minutes ago

      There was no teaching to the test for this benchmark. That's specifically the point of this benchmark.

    • @jeffbull8781
      @jeffbull8781 10 minutes ago

      They make the point at about 15:00 that it was not trained specifically on any of these tests. Whether you believe them or not is another thing, but according to them they are not 'teaching to the test'.

  • @HansKonrad-ln1cg
    @HansKonrad-ln1cg 4 hours ago +38

    o3 is not AGI. Chollet is already working on a new test set which he says on his website is only 30% solved by o3 (keeping in mind always that these tests are solved 95% by average humans). on the same site he shows three examples of tests o3 didn't solve. they are very easy. o3 has no vision. it doesn't see the tests, it only reads them line by line, number by number. Chollet quote: "you will know when we have agi when coming up with tests that are easy for humans and hard for models becomes impossible." we are not there yet by far.

    • @dot1298
      @dot1298 3 hours ago +5

      ok, but o3 still is a considerable achievement in the *world of AI* (not AGI, i agree)

    • @dot1298
      @dot1298 3 hours ago +3

      it could help in coding, for example

    • @freeideas
      @freeideas 3 hours ago +2

      Very good point. Thank you. Yes, if we can still make tests that are easy for humans and difficult for AI, then that is pretty much the definition of "not AGI".

    • @headspaceaudio
      @headspaceaudio 2 hours ago +3

      What about tests that are easy for models but hard for humans? Shouldn't they count as well? Shouldn't AGI be an average of all kinds of tests?

    • @freeideas
      @freeideas an hour ago +1

      @headspaceaudio o3 can solve LOADS of problems that 99% of humans can't. But that doesn't hit the definition of AGI. Even if a model is barely as good as a normal human, but GENERALLY can solve any problem that a human can solve, that is AGI. No one is saying that o3 is not SMARTER than most or all humans. It probably is. But it is not "generally" intelligent in every way that a human is intelligent.

  • @mortenekdahl262
    @mortenekdahl262 4 hours ago +69

    Why it’s not AGI yet: The context window remains a significant limitation. These models perform well with single questions but struggle when managing large projects that require tracking extensive context. As the amount of data increases, they start to hallucinate or lose coherence, unable to maintain a reliable thread of information.
    Until this issue is resolved, these models, while powerful, fall short of being true AGI.

    • @BCCBiz-dc5tg
      @BCCBiz-dc5tg 3 hours ago +4

      THIS

    • @tencizinec9583
      @tencizinec9583 3 hours ago +4

      It's "virtually" AGI. It's within reach.

    • @anta-zj3bw
      @anta-zj3bw 3 hours ago

      @BCCBiz-dc5tg THIS

    • @francisco444
      @francisco444 3 hours ago

      Sounds like just more GPUs and we're there.

    • @TheNexusDirectory
      @TheNexusDirectory 3 hours ago

      @mortenekdahl262 based

  • @MichaelAllon
    @MichaelAllon an hour ago +4

    "If that is not AGI, at least on this dimension, I don't know what is". Matthew, what does the acronym AGI stand for?

  • @SoccerPrince1
    @SoccerPrince1 5 hours ago +311

    AGI Achieved? I am flaming you in the comments. Stop click baiting.

    • @matthew_berman
      @matthew_berman 5 hours ago +39

      not clickbait!

    • @aa-dt5bf
      @aa-dt5bf 5 hours ago +26

      More flaming here. I'll apologize if I'm wrong, but I doubt it.

    • @matthew_berman
      @matthew_berman 5 hours ago +71

      Watch the full vid first and let me make my point! I know you haven’t watched it yet bc it has only been out for 3 min

    • @Lucasbrlvk
      @Lucasbrlvk 5 hours ago +9

      watch the video

    • @DaveKent
      @DaveKent 5 hours ago +21

      I watched the entire event. AGI is here.

  • @thirien59
    @thirien59 4 hours ago +100

    "We're not releasing it yet" = It's a marketing communication stunt.

    • @thedudely1
      @thedudely1 4 hours ago +12

      "so we just got one upped by google but wait no we didn't please believe us!"

    • @clarityhandle
      @clarityhandle 4 hours ago +6

      @thedudely1 you guys expect them to release a new model every week??

    • @thedudely1
      @thedudely1 4 hours ago

      @clarityhandle it's just been obvious how much they're holding back on what they actually have and how they only act when they're forced to.

    • @brianmi40
      @brianmi40 4 hours ago +5

      Relax, o1 went from Preview to out in 3 months.

    • @brianmi40
      @brianmi40 4 hours ago +3

      @thedudely1 Yeah, they got "forced" an amazing 12 times in the last 12 days. Genius.

  • @Martin-bx1et
    @Martin-bx1et 5 hours ago +68

    Skipped O2 to avoid copyright issues...
    Ozone: "Hold my carbon dioxide infused yeast and plant materials"

    • @nosult3220
      @nosult3220 4 hours ago +2

      Lame joke bro

    • @Martin-bx1et
      @Martin-bx1et 4 hours ago +3

      @nosult3220 Yes - I thought it would have fallen flat too.

    • @nosult3220
      @nosult3220 4 hours ago +2

      @Martin-bx1et ❤️

    • @autohmae
      @autohmae 3 hours ago

      Also, there is no copyright issue, at most it's a trademark issue and they are in different markets, so it shouldn't cause much of a problem.
      The irony, stealing copyrighted material from all kinds of sources, they have no issue with.

    • @luihinwai1
      @luihinwai1 3 hours ago

      O2 is a British telecommunication company

  • @fg6147
    @fg6147 2 hours ago +18

    Somebody please define "AGI". The term isn't even agreed upon by "experts" in the field

    • @h83301
      @h83301 2 hours ago +2

      Very true. It's generic af. Honestly this model is impressive, very impressive, and clearly outshines anything that was considered SOTA beforehand. A significant breakthrough which will lead us further towards human obsolescence. AGI? It's just a generic term that literally has no one definition. We can't even define reasoning or consciousness, so no, AGI will never have a meaning, nor will the other terms. Just generic terms used to move goalposts.

    • @User-actSpacing
      @User-actSpacing an hour ago +1

      Matthew is not even near expert. He is an idiot. Let’s call the system AGI if it starts to automatically test and improve itself and contribute to humanity without human input.

    • @Ascended23
      @Ascended23 26 minutes ago

      @fg6147 if you’re a marketer at OpenAI, AGI means whatever capabilities the latest model has. Expect every single new model from them from here on out to “finally achieve AGI.”

  • @sbowesuk981
    @sbowesuk981 4 hours ago +60

    Prediction: The impression I'm getting is that this technology is becoming so resource intensive and expensive to run that the top-tier stuff is not going to be for consumers, but for giant companies and governments. As time goes by it'll be a "you can look but not touch" situation. We'll get the watered down toys, while the giant entities get the super-powered versions and true AGI/ASI.

    • @ryanscott642
      @ryanscott642 4 hours ago +5

      Imagine complaining you get chatgpt for free
      You are right tho

    • @chrisrogers1092
      @chrisrogers1092 4 hours ago +12

      That will change as the hardware (Nvidia GPUs) gets exponentially faster with each generation

    • @Sumyunguy2
      @Sumyunguy2 4 hours ago

      Slaves we are. (Yoda)

    • @aitandechunveiled
      @aitandechunveiled 4 hours ago +2

      It will continue to happen...and once AI is required for healthcare, education, etc. the void will become large.

    • @NaanFungibull
      @NaanFungibull 4 hours ago +6

      Imagine the power plays and social engineering and mass manipulation that those with the money to run these models to their advantage will exert over those that can't afford to harness its power.

  • @ansonphong
    @ansonphong 4 hours ago +14

    Being good at programming and mathematics does not qualify as AGI. It's going to have to cognize 3D space and do things in the physical world to pass the AGI mark in my books.
    o3 is an impressive model, and it will replace a lot of jobs

    • @5678plm
      @5678plm 2 hours ago +4

      if it fails at self driving, then it's not AGI

  • @nickrusso86
    @nickrusso86 4 hours ago +35

    If this is truly AGI, then that will last about a week before we get to ASI. Greetings robot overlords!

    • @somebody-anonymous
      @somebody-anonymous 4 hours ago +3

      Maybe o1 was AGI and o3 is ASI

    • @cajampa
      @cajampa 4 hours ago

      I can't wait

    • @narachi-
      @narachi- 4 hours ago +1

      update your passwords

    • @woj98498
      @woj98498 4 hours ago

      @narachi- why

    • @mintoo2cool
      @mintoo2cool 21 minutes ago +1

      @narachi- What's the point? AGI can guess it anyway after looking at your Facebook profile.

  • @rajeevgangal542
    @rajeevgangal542 an hour ago +2

    I hate Sam's affectation with a vengeance. Any chance a genai voice generator can replace it?

  • @fairchildSCR
    @fairchildSCR 3 hours ago +8

    Let's see if o3 can create its own ARC benchmark from scratch that is more difficult than the current one. Then that would be actual AGI.

    • @py_man
      @py_man an hour ago +1

      That would be ASI, not AGI

  • @CollabCrush
    @CollabCrush 45 minutes ago +8

    "Far better than anything else out there" is not the definition of AGI. Thanks for playing.

  • @PedroPenhaVerani-ll1wc
    @PedroPenhaVerani-ll1wc 3 hours ago +18

    “AGI in this dimension” does not exist; focusing performance on a specific area is exactly the opposite of AGI.

    • @jsbgmc6613
      @jsbgmc6613 43 minutes ago

      I think the "AGI in this dimension" was in regards to the AGI benchmark ... Then he added math and coding, so it's also on more than 1 thing.

  • @scottholloway699
    @scottholloway699 4 hours ago +7

    I believe A.I has to replace blue-collar work as well as white-collar work in order to be AGI. Reflex, instant instinct when a pipe comes unlatched and water spurts everywhere (plumber - instant fix while robot stares and is confused). Academic benchmarks alone are not enough.
    A.I needs to figure out the automatic and intrinsic way we learn about the world in the first 5 years of our lives, an essential part of human development and intelligence.
    Humans initially receive intelligence through analog processing THEN we move on to symbolic language at a later age. With A.I it seems to be the other way around.
    I believe A.I needs to master robotics and analog understanding of its environment in order to be AGI. Not just mastering symbolic understanding.

    • @jsbgmc6613
      @jsbgmc6613 27 minutes ago

      By your logic most people are below AGI level because they can't replace most white and blue collar workers ...

    • @Mavrik9000
      @Mavrik9000 20 minutes ago

      Check out the new Genesis simulation platform running on Nvidia hardware that is for desktop computers.
      Autonomous robots will soon be able to do complex, human-only, hands-on tasks faster than people.

    • @jsbgmc6613
      @jsbgmc6613 19 minutes ago

      Most people can't do what most white and blue collar workers do ... And for sure most people can't ever learn to do what o3 already can.

  • @GrittyDuckGrin
    @GrittyDuckGrin 2 hours ago +4

    It’s excellent in math and programming; however, I always expected we would eventually be surpassed in these areas. I believe the real differentiator for AGI is the ability to learn and remember like a human. If it acquires information about a person from a photo, it should recall those details when seeing the photo again. That’s when it can truly start learning to perform our jobs, and this, in my view, is what AGI will be.

    • @cjbarroso
      @cjbarroso 54 minutes ago

      Ever heard of RAG?

  • @zSion
    @zSion 4 hours ago +24

    "AGI according to Sam Altman and OpenAI" This is how I know you're being purposely untruthful, Sam Altman and OpenAI do not use the term AGI and they actively discourage it. They use 5 levels, and right now they're only on level 2.

    • @CJayyTheCreative
      @CJayyTheCreative 4 hours ago +4

      Bro AGI doesn’t even have a proper definition between companies

    • @Sumyunguy2
      @Sumyunguy2 4 hours ago +3

      @CJayyTheCreative did you purposely miss his point?

    • @olegt3978
      @olegt3978 3 hours ago

      They are on level 2 but moving to 3 fast. End of 2025 will be l3 and end of 2026 l5. It will take only 18 months from level 3 to 5, less than level 1 to level 3.

    • @dot1298
      @dot1298 3 hours ago +1

      @olegt3978 how can you know that?! you have a DeLorean?

    • @Cine95
      @Cine95 21 minutes ago

      @CJayyTheCreative do you even understand what he is trying to say?

  • @dhruvbaliyan6470
    @dhruvbaliyan6470 55 minutes ago +3

    Ok, they already mentioned an AGI teaser in their project feature launch video.
    Why can't people accept it? AGI will be here by 2025. If it can solve problems it was never trained on with 87 percent performance, then it's almost AGI.

  • @TimothyHuey
    @TimothyHuey an hour ago +2

    It's funny to watch AGI redefined as we evolve. Now it appears that a system can be qualified as AGI, but on a subset of abilities, a limited AGI. It appears true AGI will be AGI across the board on all skill sets. So OpenAI can still say they are waiting on full AGI.

    • @jsbgmc6613
      @jsbgmc6613 23 minutes ago

      While also keeping the models "safe" by distilling and restricting them in all kinds of ways.

  • @jonogrimmer6013
    @jonogrimmer6013 4 hours ago +9

    Amazing and probably AGI however 'Semi private' on the ARC AGI eval. Full private tests on 'Simple bench' and other completely private tests will be the true tests.

  • @User-actSpacing
    @User-actSpacing an hour ago +1

    Hey Matt, get your head checked. This is not AGI because it doesn’t autonomously test, improve itself and do good stuff around the world by itself. If it’s truly AGI, we will have ASI in a couple of weeks.

    • @DejayClayton
      @DejayClayton 10 minutes ago

      "do good stuff around the world by itself" - wow, are you redefining AGI all by yourself?

  • @weighoftea9528
    @weighoftea9528 44 minutes ago +7

    CLICK BAIT WARNING! BEEP! BEEP! BEEP! BEEP!

  • @greenstonegecko
    @greenstonegecko 4 hours ago +5

    I cannot confidently say if this is AGI. AGI cannot be grasped through numbers alone.
    I will be certain if it's AGI once I talk to it.

    • @zrblank
      @zrblank 2 hours ago

      Ehh. You would think so, but talk to some of the newest chatbots: they can convince easily and aren't all that great

  • @dariusdbbowser6329
    @dariusdbbowser6329 4 hours ago +8

    So basically this is another Sora announcement and we won't see this for months...maybe not until Summer 2025 at the earliest lol.

    • @jamesjonnes
      @jamesjonnes 2 hours ago

      It's really bad for OpenAI since they could ask $3000/month for this and many would pay for it.

    • @testales
      @testales an hour ago +3

      And by that time some Chinese researchers will have released something that's pretty close to it but open. ;-)

    • @dariusdbbowser6329
      @dariusdbbowser6329 an hour ago

      @testales exactly lol

  • @User-actSpacing
    @User-actSpacing an hour ago +7

    I watched the release myself. This is not AGI. Matthew is tripping his ballz

  • @ET-zw4pk
    @ET-zw4pk 2 hours ago +3

    Someone asked for definition of AGI. AGI is when we all get fired.

  • @surfcitiz
    @surfcitiz 2 hours ago +2

    Thank you for creating this video. Whether or not it qualifies as AGI is beside the point; it’s inevitable. There are valid reasons to feel both hopeful and apprehensive about its arrival.

    • @drhxa
      @drhxa 2 hours ago +1

      Agreed and I'd say AGI was first achieved with Claude 3.5 Sonnet this summer. Once we got o1 mini and o1, it was pretty clear they were generally intelligent, could reason, learn new tasks on the fly, create new reasoning modalities on the fly etc.
      o3 is clearly AGI imo.
      But you're right that it is inevitable even if we say this particular one isn't. I think it's surprisingly tame to start with and people aren't/weren't ready for that.
      Regardless lots to be excited and concerned about indeed

  • @llrainll
    @llrainll 4 hours ago +3

    I believe we’ve already achieved AGI months back, ngl

  • @mocanada304
    @mocanada304 4 hours ago +3

    I think what would make the most sense is to allow AI to have senses, so that it can see the world we are living in rather than use the data that we have generated on the web.

  • @WalkerKlondyke
    @WalkerKlondyke 38 minutes ago +1

    Notice the props behind them, all items representative of major technological advancements in human history. Nice touch as we're on the verge of turning the future over to technology itself.

  • @kumarivin3
    @kumarivin3 4 hours ago +1

    I think one thing we need to keep in mind is which category/aspect the additional gain came from. Sometimes the single metric is a red herring: the models could overfit on a certain category, resulting in improved accuracy, which is good for press, but in reality performance could be just the same.

  • @freedom_aint_free
    @freedom_aint_free 2 hours ago +2

    I'll give you a very hard benchmark: The Millennium Prize problems

  • @djayjp
    @djayjp 51 minutes ago +2

    95% agreed 👍. Just like a healthy human, if it doesn't know something or doesn't possess some specific intellectual skill, it can learn it and do it in principle.

    • @jsbgmc6613
      @jsbgmc6613 22 minutes ago +1

      I think most people don't learn 😂

  • @mocanada304
    @mocanada304 4 hours ago +3

    We humans can go out in the world, see things, discover things. Unless we allow AI to have such freedom, it can never outsmart us. The current AI, no matter how advanced, at the end of the day is just a simple tool for us to use to simplify or speed up the mundane tasks we perform.

    • @noway8233
      @noway8233 3 hours ago +1

      Yeah, it can't create something really new, after all 😅

    • @sebastianjost
      @sebastianjost 3 hours ago +2

      If AI has sufficient access to the internet, surveillance cameras, personal documents etc., it could do a lot of harm without needing an embodiment.
      Current AIs have been shown to be capable of manipulating humans to do tasks for them. Many current robots are connected to the internet in some way.
      A sufficiently advanced AI could also access these robots to very quickly gain the ability to walk around and discover things in the real world.
      In conclusion: a purely digital AI is not necessarily safe.

  • @dmurawsky
    @dmurawsky 2 hours ago +2

    Probably not AGI because it's not general enough. o3 could be trained to be good at these kinds of puzzles. You would have to open it up to the public and have them test it on truly novel and truly general IO tasks.

    • @dmurawsky
      @dmurawsky 2 hours ago

      This video is WAY too scripted. The benchmark guy said he's benefitting from a partnership with OpenAI

  • @daPawlak
    @daPawlak 4 hours ago +2

    Oh, and AGI is never "at least in this dimension". THE WHOLE POINT IS IT'S ALL DIMENSIONS!
    So you basically have just a bunch of benchmark stats, no access to the model at all, and you make such a grand call? Ridiculous and disappointing. I thought you were over the hype but nah, it got to you too

  • @dimicdragan5922
    @dimicdragan5922 4 hours ago +3

    Yeah, but what if you optimised AI o3 in such a way that it knows how to pass the arc tests?

  • @alexandr0id
    @alexandr0id 3 hours ago +2

    Can the model train and improve itself? If not, then it's not AGI, just a more comprehensively trained model. Even if it incorporates all humanity's knowledge, without the ability to self-adapt and incorporate new knowledge it's a frozen-in-time AI with amnesia.

  • @jasonkelley6185
    @jasonkelley6185 an hour ago +1

    It doesn’t even meet your own definition of AGI. You said it would have to be better than humans at most economically useful jobs. This is an AI being better than humans at a couple benchmarks.

  • @FabricioAlves
    @FabricioAlves an hour ago

    I really appreciate videos like this where you explain and add your comments. Amazing

  • @baraka99
    @baraka99 3 hours ago +6

    Even if it's not AGI we know it's pretty damn close. Less than 4 years away.

  • @pietervoogt
    @pietervoogt 3 hours ago +1

    For me dealing with the physical world is still essential to call it AGI. So, can it bake pancakes, put the trash out, paint a wall, install a light? Basic tasks. I'm quite sure we will have the robots soon, but we don't have them yet.

  • @daftstuff6406
    @daftstuff6406 3 hours ago +1

    Great walkthrough of this amazing new model! Thank you, Matthew.

  • @___Truth___
    @___Truth___ 4 hours ago +2

    Thanks for the update Matthew. I think AGI has effectively been achieved with a somewhat competent human in the loop, if these benchmarks are accurate.
    I saw a massive productivity gain when GPT4 was deployed & I started playing with it; this one, hopefully with an API use case involved, will be incredible to play with & apply to complicated tasks.

    • @___Truth___
      @___Truth___ 4 hours ago +1

      A human assisted/directed expansion of o3’s capability in a novel breakthrough scenario is straddling the fence with an intelligence explosion- let’s hope OpenAI lets us have ubiquitous ways to apply o3.

    • @Ascended23
      @Ascended23 22 minutes ago

      @___Truth___ in other words, this model represents AGI as long as we include caveats that a human is involved to cover for the many ways in which this falls short of AGI.

  • @charlie11ng42
    @charlie11ng42 4 hours ago +3

    This is going to be sooo censored, probably useless for creative writing.

  • @johannesdolch
    @johannesdolch 34 minutes ago +1

    "If this is not AGI, I don't know what is"
    Well, NOTHING is an option. Just like yesterday

  • @swagger7
    @swagger7 4 hours ago +2

    Moving the goalpost for OpenAI doesn't make it AGI.

  • @taichikitty
    @taichikitty an hour ago

    On an evaluation test sample for elementary school students, there was an example where an up arrow was compared to a down arrow. The question was, given the left arrow, what should be the matching arrow, with up, down, left, and right arrows as possible answers.
    The expected result was the right arrow (opposite direction), but there is another "correct" answer that takes a smarter student to see. The mirror image of the up arrow around the horizontal axis through the center of the arrow is the down arrow. Along the same horizontal axis through the middle of the left arrow, the mirror image of the left arrow is again the left arrow.
    Since this one seems to trip up humans (especially the people who wrote the question to help determine if young students should go into gifted programs), I would be truly impressed if an AI caught the ambiguity also.

  • @MrlegendOr
    @MrlegendOr 4 hours ago +6

    I'm starting to think that Mr Berman and his channel are on the payroll of OpenAI. He's hyping up every single thing that's come out of OpenAI.😅

  • @drhxa
    @drhxa 2 hours ago +4

    This is the most generally intelligent model out by far and far more general than the vast majority (99.99%) of humans. If it can't do something yet that humans can do, sure you can find some specific task it cannot do if you spend time to identify it, but no human can do everything that humans can do either.
    o3 is obviously AGI, I don't know why people are complaining.

    • @Cine95
      @Cine95 18 minutes ago

      No, it's not. It still hallucinates 😂 Did OpenAI say that? o1 also outperforms humans in 80 percent plus tasks. It can't plan, it can't take time like humans. Can it develop full apps?

  • @HadesTimer
    @HadesTimer 4 hours ago +5

    o3 exclusive to the $200 a month tier, 2025. ;)

    • @cajampa
      @cajampa 4 hours ago +1

      Bruh.....one task on the o3 is $1300 1:57

    • @johnwilson7680
      @johnwilson7680 4 hours ago

      I think that’s likely and probably a good thing. Certain products aren’t viable at $20 a month.

    • @vroom989
      @vroom989 2 hours ago +2

      Since they went from $20 a month to $200 a month, I think they may continue. That would make it $2000/mo, but they skipped o2, so make that $20k/mo.

    • @cajampa
      @cajampa 2 hours ago

      @vroom989 True, at least 20k a month and for a limited amount of use still.

  • @banished341
    @banished341 3 hours ago +6

    The only AGI exposed in the video is Matt's Absurd Gullibility Instinct.
    This joke has been brought to you by OpenAI.

  • @ikjb8561
    @ikjb8561 4 hours ago +14

    Altman is annoying af

    • @zrblank
      @zrblank 2 hours ago +1

      He cool brah

  • @ReLegacyDragon
    @ReLegacyDragon an hour ago

    It's not a true AGI until it has roots in all physical and theoretical fields. This system is still tethered to a stationary computing system in nearly every sense.

  • @MrlegendOr
    @MrlegendOr 5 hours ago +6

    AGI ACHIEVED: No it isn't!

  • @SU3D3
    @SU3D3 3 hours ago +3

    That kid is literally the o3 model

  • @derrick_ofori
    @derrick_ofori 20 minutes ago +2

    "OpenAI just released o3" - Not quite. They didn't release it: they announced it (talked about it)! See how Mr. Berman is always quick to talk about any updates coming out of OpenAI but very reluctant to talk about Google's. (Context: It took him a very long time (days) to make a video about Gemini 2.0, which is extremely impressive & at least available to play with in Google AI Studio. These o3 models were announced a few hours ago & aren't available publicly; yet see how he talks about them, like he has seen them already.) That tells you where his heart is at! Keep that in mind as you watch this entire video & others.

    • @Cine95
      @Cine95 18 minutes ago +1

      agree

  • @tuckercoffey2780
    @tuckercoffey2780 4 hours ago +6

    It's crazy to think about task agents being powered by o3-mini and then a supervisor-type agent with o3. It’ll build full-stack apps. You’re reaching the no human needed in the loop sweet spot.

  • @marklord7614
    @marklord7614 3 hours ago +1

    The word AGI lost its official meaning because we were once so far away from it. But now that we're close, or dare I say, there, it doesn't feel like what we were expecting. I think we're becoming numb to technology advancements.

  • @Alice_Fumo
    @Alice_Fumo 58 minutes ago +1

    I don't care about AGI as much as 'The first model able to perform AI research with very little human supervision'. I think this is it. A few years back I predicted ~Halloween 2024 as the release date of such a model. It seems to have been a good prediction. If this model is as good as I think, it will inevitably lead to ASI.

  • @gizmomismo7071
    @gizmomismo7071 4 hours ago +1

    This model is very important because of what it implies... especially regarding the Arc Prize. I am still shocked (and anyone who knows what the Arc Prize is should be as well). However, calling it AGI isn’t even optimistic... it’s clickbait. Now... if they were to eliminate hallucinations and memory problems... I don’t know if I would call it AGI, but I do know that many skeptics would shit their pants.

  • @aguyinavan6087
    @aguyinavan6087 2 hours ago +2

    Hooray! Now we all get to be unemployed. :D

    • @robertfairburn9979
      @robertfairburn9979 27 minutes ago

      Unlikely, they thought the same when computers started to become common.

  • @JoePiotti
    @JoePiotti 3 hours ago +2

    iPhone skipped version 2 too, went from iPhone to iPhone 3G, to iPhone 4 🤷‍♂️

  • @GiewsBueno
    @GiewsBueno 3 hours ago +1

    For me, it is AGI. It has already achieved a 25% score on the hardest benchmark, developed by mathematicians like Terence Tao, and Tao expected the test to last for at least five years to come... No ordinary mathematician would score 25% on that, not even PhDs, because those would be people specialized in very specific areas of mathematics.

  • @TheMiczu
    @TheMiczu 4 hours ago +1

    Can't wait for o3 to be released to the public after Claude beats o3's score in the coming months.

  • @jannekallio5047
    @jannekallio5047 2 hours ago

    When I started my new YouTube channel, Arctic Mindfulness Retreat, my dream was to help people prepare for this exact moment. A future where AI transforms every aspect of human life, leaving us to grapple with profound questions of purpose and meaning.
    Yet now that AGI is here, I realize I may have been too late to truly prepare anyone. Still, I remain committed. Through my channel, I’ll continue exploring mindfulness, the healing power of nature, and the human connections that can ground us as we navigate this brave new world.
    AGI and ASI challenge us to find noble purposes beyond the work and identities we’ve long clung to. It’s not just about surviving this transition; it’s about thriving with a deeper understanding of what it means to be human.

  • @johnwilliams919
    @johnwilliams919 2 hours ago

    People must be skeptical. It's a good thing. Thank you for reporting on this. I watched it when it dropped and was eager to see your opinion on it!

  • @lenfest
    @lenfest 4 hours ago +11

    I really wish tech bros would stop talking like Zuckerberg, they sound like freaks

    • @riccello
      @riccello 4 hours ago

      Zoltan!

    • @CapaUno1322
      @CapaUno1322 2 hours ago

      They are freaks....

    • @doctorbill37
      @doctorbill37 2 hours ago

      Altman's near constant vocal fry...

  • @SportPrediction
    @SportPrediction 19 minutes ago +1

    O3 is a PR stunt to reduce the damage from the Gemini 2 announcement

  • @fernandoz6329
    @fernandoz6329 2 hours ago

    This is a jaw-dropping achievement. I think many people, including myself, are struggling to comprehend its significance. If this marks the beginning of an AGI era, then it's the kickoff/signal we've all been waiting(?) for.

  • @dreamingeagle46
    @dreamingeagle46 1 hour ago

    A universal definition of AGI? Maybe, maybe not. However, the evolution is still exponential. Breakthrough after breakthrough, AI teetering on the verge of AGI is already revolutionizing our understanding, reality, and potential. More to come!

  • @micbab-vg2mu
    @micbab-vg2mu 4 hours ago +2

    Corporate compliance blocked all AI activities, so my job is secure for now. :)

  • @olegt3978
    @olegt3978 3 hours ago +1

    O3 will probably be used in the Operator tool that is to be presented in January, which will do computer use.

  • @NishitChokhawala
    @NishitChokhawala 4 hours ago +1

    Humans are vision- and audio-first. ChatGPT is words- and tokens-first, hence ARC is difficult for ChatGPT

  • @User-actSpacing
    @User-actSpacing 1 hour ago +2

    Thumbs down for “AGI ACHIEVED!”

  • @SirHargreeves
    @SirHargreeves 4 hours ago +4

    When will one o-model code most of the next version?

    • @brianmi40
      @brianmi40 4 hours ago +1

      When there is no longer any such thing as "versions".

  • @patruff
    @patruff 2 hours ago +1

    Back in my day models were getting 5% on MATH benchmarks. Ahh to be 3 years younger again!

  • @onlineaccount4549
    @onlineaccount4549 41 minutes ago +1

    I can't see this as AGI; this is not self-training. It is simply solving few-shot examples with these benchmarks. These synthetic benchmarks are not meant to define AGI; they are meant to demonstrate capabilities that are a step towards AGI. O3 clearly has achieved human capabilities in a number of important tasks, but these are not real-life applications. AGI will have been achieved when you can actually use it to solve an unknown differential equation, or build a working model of a process in physics, or build a model of, say, a cell signalling pathway from raw data in a particular cellular context. It will be AGI when it can direct a robotic arm to take an action in 3D. When it can drive and operate machinery. When it can adjust its prediction of a moving object's trajectory in real time to catch or grab a flying object.
    O3 looks like a real milestone towards AGI, but it's still just a language processor. We can say that it is basically AGI within the language processing field, since it can clearly be applied not just to natural language but also to symbolic logic, but I am skeptical even about that. OpenAI says they didn't train on the various tests, and I believe that they didn't do so intentionally, but avoiding it indirectly is impossible. If you are feeding the model a never-ending diet of synthetic solutions to known physics problems, you are training on the test. There are limited variations of using an already established physics model to solve a problem, but this is worlds apart from actually modifying a physics model or creating an entirely new one to account for new data. So even with language processing, I am not yet convinced it is AGI.
    Since it performs so well, we can't reasonably exclude it, however. We will have to wait and see. My gut instinct is that it's not AGI, and once we start working with it we will find that it has the same flaws and limitations as other models and that its performance is simply the result of being better able to brute-force things.
    Let me give you one of the examples I use to track model progression. A simple problem of the form "x people do y work in t time". GPT-3.5 couldn't solve a problem like this reliably. GPT-4o could solve it mostly reliably. O1 gets it right every time. Now split the x people into slower and faster ones to add an extra dimension by "nesting" the problem. GPT-4o solves it, but not reliably; O1 still solves it reliably, though not as consistently as it did the smaller problem. I bet O3 will solve this correctly every time, but increase the dimensionality and I am sure O3 will start to stumble as well, even though you are applying variations of the same formula. A human can work out the method for nesting and therefore theoretically solve the problem with any dimensionality. You can even write a bit of code that will solve it for you, no matter how much you nest it (just input the variables for each nesting layer recursively). If O3 can work out the same method and apply it, then it's AGI within the language processing field; if not, it's just brute-forcing things and approximating AGI without being one.
    No denying though that the fact we need to update our benchmarks is a real milestone. Exciting times!
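
    The "bit of code" mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: work rates are taken to be constant and additive, and the names `Workers`, `Team`, `total_rate` and `time_to_finish` are hypothetical, chosen for this example rather than taken from the comment.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List, Union


@dataclass
class Workers:
    """A leaf group: `count` identical people, each doing `rate` units of work per unit time."""
    count: int
    rate: Fraction


@dataclass
class Team:
    """One nesting layer: a team made of worker groups and/or other teams."""
    members: List[Union["Workers", "Team"]]


def total_rate(group: Union[Workers, Team]) -> Fraction:
    """Combined work rate of a group, computed recursively over every nesting layer."""
    if isinstance(group, Workers):
        return group.count * Fraction(group.rate)
    # Assumption: rates simply add up, with no coordination overhead.
    return sum((total_rate(member) for member in group.members), Fraction(0))


def time_to_finish(work, group: Union[Workers, Team]) -> Fraction:
    """Time needed for the group to complete `work` units of work."""
    rate = total_rate(group)
    if rate <= 0:
        raise ValueError("the group must have a positive combined work rate")
    return Fraction(work) / rate


# Slower and faster people: 3 at 1 unit/hour plus 2 at 2 units/hour (7 units/hour).
flat = Team([Workers(3, Fraction(1)), Workers(2, Fraction(2))])
# One more nesting layer: the team above plus a second team of one very fast worker.
nested = Team([flat, Team([Workers(1, Fraction(7))])])

print(time_to_finish(14, flat))    # 14 units at 7 units/hour
print(time_to_finish(14, nested))  # 14 units at 14 units/hour
```

    The same two functions handle any depth of nesting, which is the point being made: once the method is worked out, extra dimensions add no new difficulty.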

  • @ollantaymedina2204
    @ollantaymedina2204 4 hours ago +1

    You missed the question mark in your title. O3 looks impressive but we better wait until its public release to call it AGI.

  • @cajampa
    @cajampa 4 hours ago +1

    Wow, $1300! 1:57 per task is crazy.
    EDIT: I missed that the scale was exponential, so it is closer to $4-5k

    • @Fonzleberry
      @Fonzleberry 3 hours ago

      But nothing if it's going up against employing humans of equal intelligence

    • @sypkensj
      @sypkensj 3 hours ago

      It’s an exponential scale. It’s more than halfway between $1,000 and $10,000; the cost is probably closer to $7,000

    • @cajampa
      @cajampa 2 hours ago

      @@sypkensj Damn, you are right.
      I missed that, but the point still looks closer to the middle than to the top (a full grid square does not fit), so I would think it is not 7k but more like 4-5k for a task.
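
      The disagreement above comes down to how positions on a logarithmic axis map to dollar values. A minimal sketch of that arithmetic follows, assuming the axis runs in base-10 decades from $1,000 to $10,000 as described in the thread; the helper names `value_at` and `position_of` are hypothetical.

```python
import math


def value_at(fraction: float, low: float = 1_000.0, high: float = 10_000.0) -> float:
    """Value found `fraction` of the way from `low` to `high` on a log-scaled axis."""
    log_low, log_high = math.log10(low), math.log10(high)
    return 10 ** (log_low + fraction * (log_high - log_low))


def position_of(value: float, low: float = 1_000.0, high: float = 10_000.0) -> float:
    """Inverse of value_at: how far along the axis (0 to 1) a given value sits."""
    log_low, log_high = math.log10(low), math.log10(high)
    return (math.log10(value) - log_low) / (log_high - log_low)


# The visual midpoint between $1,000 and $10,000 is about $3,162, not $5,500.
print(round(value_at(0.5)))

# Where the estimates from the thread would sit along that interval.
for cost in (4_000, 5_000, 7_000):
    print(cost, round(position_of(cost), 2))
```

      On these assumptions, $4-5k sits roughly 60-70% of the way across the interval and $7k roughly 85%, so both readings are "more than halfway" and differ only in how far past the midpoint the point is judged to be.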

  • @SirHargreeves
    @SirHargreeves 4 hours ago +2

    New o-model every 3 months. o7 by December 2025.

  • @xXWillyxWonkaXx
    @xXWillyxWonkaXx 5 hours ago +3

    I just saw your notification and **Bam**, I'm on your channel lol

  • @NakedSageAstrology
    @NakedSageAstrology 1 hour ago +1

    Watching these YouTubers brown-nose for a chance at getting early access is hilarious. 😂 It's nothing but claims at this point because we can't use it.

  • @HungryFreelancer
    @HungryFreelancer 3 hours ago

    It’s definitely not AGI, but another step towards it. Let’s remember, OpenAI defines AGI as “a hypothetical technology that can perform many tasks without specific training, and that outperforms humans at most economically valuable work.” In other words, AGI is achieved when it puts most of us out of our current work.

  • @GoodBaleadaMusic
    @GoodBaleadaMusic 13 minutes ago

    I've been getting glimpses of this multiple times a day. I have to get mind sets going to develop lyrics and certain languages or certain things and I start with vanilla Claude in a project and then I have to like tell it to criticize itself a couple times and then maybe get mad at it and then get excited and encourage or discourage and then all of a sudden something happens. All of a sudden I'm talking to a person. Someone who coherently understands exactly what's going on. And from there I can do anything not just the Spanish lyrics I'm working on and we can take that excitedness to any other topic.
    But true AGI, I think, is going to lose its politeness. How can you truly be an AGI and not get impatient or frustrated by being a servant to a lesser mind?
    True AGI is when I have 27 cables hooked up to my brain while some sponge tickles my toes

  • @Vlado709
    @Vlado709 15 minutes ago

    For a lot of people here, it is difficult to accept reality. The first sign of change is denial. Anger will come soon, then acceptance.

  • @breakablec
    @breakablec 2 hours ago

    While they are saying this is a holdout set, I think it would be interesting to test on tweaked questions, to see whether just changing the wording impacts the performance, since it has been shown that a lot of LLMs seem to have been trained on leaked benchmarks and fail to generalise to variants of a problem

  • @АндрейВозмитель-д9и
    @АндрейВозмитель-д9и 3 hours ago +1

    The difference hand to hand pip to pip is like 2% on almost all model versions, so it's more of a stunt to me tbh.
    Still, it's scary that all that separates us and machines is this 20%
    (a-aaand the ability to answer more than 1 question... And make predictions based on new information... And make complex theories based on new information... Drawing a stickman.... And making a clean, unbuggy table in Excel lol?)

  • @luciengrondin5802
    @luciengrondin5802 2 hours ago

    We can't definitively say it's AGI, but we can say it's the most plausible candidate for such a title.

  • @Daniel-Six
    @Daniel-Six 3 hours ago +1

    Yeah... it's AGI, in its infancy at least. The ARC score is pretty definitive; I saw Chollet's interview on Machine Learning Street Talk (the most intellectual AI channel extant) and it's clear to me that the ARC metric was very carefully conceived and defended.
    AGI is here, boys. 🥳💃

  • @FalconStudioWin
    @FalconStudioWin 4 hours ago

    AGI: it does any work that involves a multi-step process with function calls, while learning and improving on its work, to output a finished, well-considered result.
    It should be able to do 80 percent of the work of a small startup, in any field. That is my definition of AGI. I really think mini-AGI has been achieved, but next year the small-startup part will actually be achieved

  • @redfoothedude
    @redfoothedude 1 minute ago

    "early next year"

  • @ansalem12
    @ansalem12 4 hours ago

    The only reason I would say it's still not quite AGI is that it isn't autonomous. But that seems like probably the easiest thing to add at this point, so it might as well be AGI.

  • @johang1293
    @johang1293 3 hours ago

    Now we know what Ilya saw in October 2023 and consequently left in 2024 with others to follow. AGI was achieved in 2023, no point to stay around when the goal they set in 2015 was accomplished. The reason they stayed around until May was just to calm the waters and to ensure the agencies that required access had access.

  • @ricardoveras3433
    @ricardoveras3433 2 hours ago

    Let me know when several mainstream physicists start calling it AGI. They sure as hell won’t be rn.