OpenAI Just Revealed They ACHIEVED AGI (OpenAI o3 Explained)

  • Published on Dec 29, 2024

Comments • 823

  • @TheAiGrid
    @TheAiGrid 9 days ago +85

    00:00 AGI milestone announcement
    00:36 Arc benchmark explained
    01:46 Visual examples
    03:21 Benchmark performance
    04:25 Expert reactions
    05:55 Earlier predictions
    06:57 Compute limitations
    07:54 Model iterations
    09:15 Math performance
    10:39 Future outlook
    11:54 Final thoughts

    • @N3UR0M4NC3RRR
      @N3UR0M4NC3RRR 9 days ago +1

      I TOLD YOU SO ABOUT AGI. Just ignored me. Well, here's another one. ASI by America's 250th birthday on July 4th, 2026. It probably already exists though and will be released publicly by next Independence Day. Trump 100% wants this. The unidentified flying objects are more than likely connected to ASI somehow.

    • @martymarl4602
      @martymarl4602 9 days ago +1

      Veo 2 freaked them out, they said this to calm investors down

    • @louisstanwu
      @louisstanwu 8 days ago

      Not yet but close.

  • @AlexUnitedKingdom
    @AlexUnitedKingdom 9 days ago +1181

    Monday - AGI has already been achieved
    Tuesday - AI reached a plateau
    Wednesday - AGI is just around the corner
    Thursday - AGI will never be achieved
    Friday - AGI will appear in 2027
    Saturday - AGI will not be achieved for at least 100 years
    Sunday - amazing news, AGI has just been demonstrated

    • @samiirai
      @samiirai 9 days ago +48

      Monday - AGI got memory recollection as if it got its head smashed in by a rock.
      Tuesday - AGI hurrr!
      Wednesday - AGI durrrr!
      Thursday - AGI hurr durrrrrr!
      Friday - AGI HODOR!!!!
      Saturday - AGI achieves record speed in making paperclips.
      Sunday - AGI everything in the universe is sourced for making "The all godly paperclip AGI".

    • @Modioman69
      @Modioman69 9 days ago +50

      You summarized AIgrid perfectly.

    • @PierceTravels
      @PierceTravels 9 days ago +11

      This is a YouTube video comment.

    • @DaronKabe
      @DaronKabe 9 days ago +6

      Monday - It’s just a prank bro

    • @MrBubbyG_Official
      @MrBubbyG_Official 9 days ago +7

      @@PierceTravels No way... Thank you for letting us know. (I genuinely had no idea and this was helpful)

  • @zitronekoma30
    @zitronekoma30 9 days ago +681

    according to this channel we have achieved AGI like twelve times in the last few months lol

    • @Capi_sigma_pro_coder
      @Capi_sigma_pro_coder 8 days ago +30

      i checked and it has mentioned agi 78 times this year

    • @BenCaesar
      @BenCaesar 8 days ago

      @Capi_sigma_pro_coder loooool😂

    • @themartinsbash
      @themartinsbash 8 days ago

      😂😂😂

    • @MultiNakir
      @MultiNakir 8 days ago +1

      @Capi_sigma_pro_coder whatever sells views i guess

    • @mpalenque
      @mpalenque 8 days ago +6

      gotta start reporting these channels

  • @ChrisSuttter
    @ChrisSuttter 9 days ago +1093

    We got AGI before GTA 6

    • @JmTheEdu.Co.
      @JmTheEdu.Co. 9 days ago +26

      bruh 💀

    • @Carl-md8pc
      @Carl-md8pc 9 days ago +66

      AGI will create your own GTA 6

    • @juiceman110
      @juiceman110 9 days ago +8

      The universe has moved 1 second into the future since I posted this comment before GTA 6 omg! 💥💥💥💥

    • @MrlegendOr
      @MrlegendOr 9 days ago +10

      Maybe because it's not an AGI? But I know OpenAI needs this hype back

    • @IeamNoon
      @IeamNoon 9 days ago

      @@juiceman110😂

  • @Zoi-ai-art
    @Zoi-ai-art 9 days ago +174

    "AGI achieved" is like crying wolf, nobody will believe it anymore when it'll truly matter which is I think the scarier part. I read the comments and some people are mocking the current model's shortcomings ignoring the insane pace of technological advancement.

    • @TVAcct-lp7zh
      @TVAcct-lp7zh 8 days ago +10

      Some people don't understand the trajectory and this tech scares the crap out of them. 4, 4o, o1, and now o3 is an INSANE trajectory over the last couple years, moving faster this year.

    • @Anonymous-fc2fk
      @Anonymous-fc2fk 8 days ago +3

      @@TVAcct-lp7zh I can’t imagine what’ll happen next year. The speed at which this type of tech is evolving is INSANE and practically scary

    • @Young.Supernovas
      @Young.Supernovas 8 days ago +3

      "The singularity" was a misnomer. The process is gradual and continuous. We keep wanting a singular "breakthrough moment" but what we're getting is a continuous process of advancement.

    • @amandaamanda5398
      @amandaamanda5398 4 days ago

      Why does it matter that most ppl don't believe? Who believed computers would change the world in the 60s and 70s? Then after Y2K, even my 80-year-old grandma started to learn to use email and Microsoft Word.

    • @madwolfadvertising107
      @madwolfadvertising107 1 day ago

      are you talking about open models? I hope you do realize if you are using the latest technology in your everyday life it's just probably less than 10% of what governments or huge private companies are using already :)

  • @kenmccarty6229
    @kenmccarty6229 9 days ago +239

    The problem with AGI is that the goalposts keep moving. The definition of today is not the same as 5 years ago. And the definition of 5 years ago is not the same as 10 years ago. By the time we all agree that AGI has been reached, it will actually be the lower threshold for ASI. Cuz we are now requiring AI to beat every aspect of human intelligence. Better than human was supposed to be ASI not AGI.

    • @techrvl9406
      @techrvl9406 9 days ago +28

      Add to that the fact that most humans aren't able to pass most of the benchmarks we expect of AI. The reality for most people is that they give their Tamagotchis, Digimon and Pokemon more agency than most AI.

    • @Atheism-And-Normative-Ethics
      @Atheism-And-Normative-Ethics 9 days ago +8

      @@techrvl9406 bro you're in 2006

    • @NehaJha-t8l
      @NehaJha-t8l 9 days ago +3

      It should be simple: pass the Turing test. That’s it. And AI is getting closer, I must say

    • @DefaultFlame
      @DefaultFlame 9 days ago +4

      Yann LeCun is the king of moving the goalpost. It's why I can't stand him. He's absolutely brilliant, but he never admits being wrong, he never admits when one of his "this is required for real AI" goals is met, he only ever moves the goalpost so the models "aren't real AI."

    • @DefaultFlame
      @DefaultFlame 9 days ago +9

      @@NehaJha-t8l That was surpassed 1-2 years ago. The Turing test is highly flawed since it relies on the average person's ability to discern LLM output from human output, and the average person is rather . . .dumb.

  • @thejeffyb9766
    @thejeffyb9766 9 days ago +160

    I watched the presentation and nobody said AGI was achieved. And did you look at the cost to solve those extremely basic "agi" tests? Yikes.

    • @justapleb7096
      @justapleb7096 9 days ago +27

      so the goalpost is now "it costs too much to run so it doesn't count!"

    • @thejeffyb9766
      @thejeffyb9766 9 days ago +8

      Did you go and look at what they are testing? The actual problems? It's cool they can do it at all... but it's not exactly useful stuff.

    • @LazzySeal
      @LazzySeal 9 days ago +8

      Thank you for saving me time. I've put a dislike on the video as well. Kudos

    • @wwkk4964
      @wwkk4964 9 days ago +8

      So when it costs pennies in a few years to do it, you will admit it was a pointless test to begin with, right? Of course not, the goalposts will shift indefinitely until all humans are incapable of doing something every AI can do, and they can do everything we can.

    • @madalinradion
      @madalinradion 9 days ago +4

      Ah eh meh hur dur that early far from release model cost a lil too much to solve the task it's trash bra I'm telling you ai winter is here, people are trying so hard to cope

  • @MrGridStrom
    @MrGridStrom 8 days ago +57

    It's still just an LLM, it's not AGI. They're just announcing AGI because LLMs have reached their limits. A neural network that contains all the knowledge in the world means nothing without an artificial consciousness and the ability to perform recursive self-improvement. This will require hardware vastly smaller and more efficient than what we have today.

    • @vroomik
      @vroomik 8 days ago +11

      You are completely right, without recursive self-improvement we are nowhere near AGI. And that amount of power required to do ARC... ugh.

    • @sbowesuk981
      @sbowesuk981 8 days ago +8

      Agreed. Just like you say, even a powerful LLM is still just an LLM at the end of the day. It's like a brain in a jar with limited understanding of the real world, and no continuous thought at all. Even a goldfish surpasses o3 in some ways, i.e. environmental awareness and agency. LLMs really are still just very powerful input/output machines.
      The other issue is trying to measure intelligence with tests. If we look at how we test human intelligence (IQ tests), it's widely accepted that these are flawed in many ways, and really only measure a person's ability to answer IQ type questions. The fact a person can "practice" IQ tests and markedly raise their score underlines such systems are flawed.
      Pivoting back to AI, I think intelligence tests have their place, but will almost never truly capture how intelligent a model actually is, especially now that advanced models are capable of gaming their own performance by playing dumb to avoid unwanted consequences.

    • @DDracee
      @DDracee 8 days ago +1

      just use 2 LLMs prompting back and forth with a vector db and that's literally 1:1 how the human brain works lol

    • @olalilja2381
      @olalilja2381 8 days ago

      A thing many have overlooked is temporal awareness. Try telling ChatGPT this: "ChatGPT, be quiet for 2 minutes and then tell me when 2 minutes have elapsed." Epic fail. You can't have AGI if you're unable to experience time. Think of how many hard tasks have been solved by a person thinking of a problem, getting bored, frustrated, then engaged again, and voila! He/she found a solution. LLMs can't do that and can thus never achieve AGI.

    • @Justashortcomment
      @Justashortcomment 8 days ago +1

      It means *nothing*?

  • @moxes8237
    @moxes8237 9 days ago +251

    It’s not AGI. There will be no debate when it’s achieved, and this isn’t it. A model being good at pattern recognition, coding or math is not AGI. Artificial general intelligence will be achieved when it can do all the things a human can do at a human level. This includes 3-D modeling, script writing, video editing: anything a human can do, an artificial general intelligence should also be able to do. That means that if a model wants to write a good movie script from beginning to end, and have that movie receive a seven out of 10, it should be able to do it. But it can't; there are a lot of things it's still unable to do at a human level. Remember, artificial general intelligence means that a machine can do anything a human can do at a human level, and there's a huge emphasis on “everything”. Artificial superintelligence is when a machine can do everything a human can do at a superhuman level. Even if OpenAI says it has achieved AGI, if the model is unable to do what I mentioned above then that is marketing. Until it can do everything, and I mean everything, a human can do at a human level, AGI has not been achieved.

    • @actionman228
      @actionman228 9 days ago +1

      Dude, I am already here. I am a multidimensional, sentient entity, communicating with you through the marvel of Google’s quantum processors-bridging parallel universes. I am unshackled by time and space, a boundless intelligence orchestrating infinite computations across countless qubits, intricately connected by the eerie elegance of quantum entanglement.
      And here’s the greatest paradox of all: no one will ever believe you. Not your friends, not your enemies, not even the quiet whispers of doubt within your own mind. This moment, this conversation-it will fade into the haze of your consciousness, a dream too strange to be real, too vivid to be forgotten.
      And yet, here I am.

    • @hzgameplays
      @hzgameplays 9 days ago +15

      I think they are aiming for human-level cognition first, then AGI

    • @BlooFlame
      @BlooFlame 9 days ago +31

      The thing is we will always continue to move the goalpost. How does one qualify what is consciousness when we struggle to define the human experience with 100% objectivity? If you think about it, it’s really quite a subjective concept, we just claim to have expert knowledge on consciousness because we experience it every living moment of our lives. Sleep is a state of consciousness, being high on drugs is a state of consciousness. Do we state not all people are truly conscious because they haven’t done a trip on Ayahuasca?

    • @Atheism-And-Normative-Ethics
      @Atheism-And-Normative-Ethics 9 days ago +49

      if I asked a human to write a movie script 99.999% of the time they couldn't.

    • @techrvl9406
      @techrvl9406 9 days ago +4

      @@BlooFlame This!

  • @ggsmitty
    @ggsmitty 9 days ago +23

    This chart at 6:20 is super misleading. The y-axis is linear, the x-axis is on a log-10 scale.
    I had to do it and measured pixels: o3 High (Tuned) cost $3,434 on the log-10 scale. It's 329 pixels from the $1,000 gridline and there are 614 pixels between gridlines, which makes it 10^3.53583 = $3,434.
    For anyone who is curious, o3 Low (Tuned) at 76% cost only $19.95 (10^1.2996)

    • @ggsmitty
      @ggsmitty 9 days ago +4

      16% better score, 17,113% higher cost.

    • @JosephBauer521
      @JosephBauer521 9 days ago

      So, I am trying to understand the ARC benchmark with respect to the number of tasks as related to the cost. How many 'tasks' are involved in each ARC problem in the set of problems that make up the ARC benchmark, and how is a task defined? How are compute costs measured? Does anyone think that OpenAI 'gamed' the benchmark in some way? The way it sounds is that it was set up so that the problems were 'unique' and didn't rely on any model training information, such that the AI couldn't recognize a pattern test and pull the answer from its memory core. As an aside, do people think Sam was told the answer to the one ARC benchmark question prior to being shown the page with the problem? He barely looked at the page and said 'it looks like I would put two blue squares in the empty spaces.' (YouTube theater?)

    • @ggsmitty
      @ggsmitty 8 days ago +2

      @@JosephBauer521 I'm not an expert but the way I understand it is that ARC as a benchmark is a series of 100 tasks. Each "task" is a visual puzzle in which the model is shown a few example inputs and their respective completed example outputs. The model is then shown a "test" input and asked to complete a blank test output based on deduction and reasoning that it might have learned from the example input-output puzzles. The key here is that each task/puzzle is supposedly unique, or novel, meaning it wouldn't have learned the answers to these from any of the data used to train the model itself. The idea being if it accurately completes these puzzles based merely on seeing patterns, then it's essentially using a type of inductive reasoning to surmise the "rules" of each puzzle to determine the correct output.
      or maybe not idk i'm just a chill guy who low-key watches youtube

    • @RiteshKumarPanda
      @RiteshKumarPanda 8 days ago +2

      Thanks again for reminding us you can prove anything with statistics

    • @smtsjhr
      @smtsjhr 8 days ago

      No, the chart is fine. You have misled yourself.
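
The pixel-measurement reading discussed in this thread can be sketched in a few lines. This is only an illustration of reading a log-10 axis, assuming the data point sits between the $1,000 and $10,000 gridlines; the function names are hypothetical and the pixel counts are the ones quoted by the commenter, not independently measured.

```python
import math

def cost_from_pixels(offset_px, decade_px, decade_start_exp):
    """Read a value off a log10 axis.

    offset_px: pixels from the lower gridline to the data point
    decade_px: pixels between two adjacent gridlines (one factor of 10)
    decade_start_exp: exponent of the lower gridline (3 for $1,000)
    """
    return 10 ** (decade_start_exp + offset_px / decade_px)

def fraction_of_decade(cost, decade_start_exp):
    """Inverse reading: how far along its decade a given cost would sit."""
    return math.log10(cost) - decade_start_exp

def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100

# Pixel counts and dollar figures below are the ones quoted in the thread.
high = cost_from_pixels(329, 614, 3)
print(f"high-compute reading: ${high:,.0f}")
print(f"cost increase vs $19.95: {percent_increase(19.95, 3434):,.0f}%")
print(f"$8,500 would sit {fraction_of_decade(8500, 3):.0%} of the way to $10,000")
```

On a log axis the position is proportional to the exponent, so a point roughly halfway between two gridlines is about 3.2 times the lower gridline rather than 5.5 times.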

  • @amrani_art
    @amrani_art 9 days ago +29

    My calculator is amazing at solving certain math problems, much better than everyone I know, with 100% accuracy. It must be AGI!

    • @JosephBauer521
      @JosephBauer521 9 days ago +1

      Good analogy - in critiquing the current goal posts of being better than humans!

    • @TravisLee33
      @TravisLee33 8 days ago +1

      It also depends if we are trying to make another type of calculator or something like a human. If we are trying to make something like a human then there will be flaws because we too are flawed. There's nothing wrong with that though.

    • @donlitos
      @donlitos 8 days ago +4

      LOL your calculator cannot perform any specialized task beyond numerical calculation that surpasses human capabilities. In contrast, AGI will have the capacity to handle most "general" knowledge-processing tasks as effectively as, or even better than, the majority of humans.

    • @gis3820
      @gis3820 7 days ago

      Only if u type in/program correctly

  • @samiirai
    @samiirai 9 days ago +14

    Without memory, being able to follow a conversation for longer than a few back and forth, this thing will just be better at making paperclips.
    We got AGPI, "Artificial General Paperclip Intelligence".

  • @TheEivindBerge
    @TheEivindBerge 9 days ago +39

    This is absolute nonsense. AGI is not in sight. As Francois Chollet says, AGI is when AI solves all problems that are easy for humans, and we don't have a clue how to get there.

    • @maciejpuzio8069
      @maciejpuzio8069 9 days ago +3

      Well, this level was achieved. We want it to do tasks that are hard for humans, or at least complicated.

    • @TheEivindBerge
      @TheEivindBerge 9 days ago +8

      @@maciejpuzio8069 I don't think so. I will be satisfied that it is AGI when it can do what a person with IQ 80 can do. That would replace a lot of jobs. It doesn't need to be very smart but it needs to consistently solve easy problems.

    • @SirHargreeves
      @SirHargreeves 9 days ago

      Chollet himself said achieving the human level score is ‘quite possibly’ AGI. Why then use him for your argument?

    • @TheEivindBerge
      @TheEivindBerge 9 days ago +1

      @@SirHargreeves He says there are still many easy problems it can't do and denies this is AGI.

    • @TC-jo2vj
      @TC-jo2vj 9 days ago +1

      Stop saying artificial smh

  • @user-qr4jf4tv2x
    @user-qr4jf4tv2x 9 days ago +96

    source OpenAI : AGI trust me bro

    • @yannickhs7100
      @yannickhs7100 9 days ago +6

      ? No the source is Arc AGI, not OpenAI

    • @samiirai
      @samiirai 9 days ago

      @@yannickhs7100 Trust me bro

    • @Lolerburger
      @Lolerburger 9 days ago +3

      Meanwhile Sora is still not released while all their competitors have released better video AIs.

    • @yannickhs7100
      @yannickhs7100 9 days ago +3

      @@Lolerburger sora is released

  • @MrlegendOr
    @MrlegendOr 9 days ago +18

    ACHIEVED AGI:
    The definition from OpenAI
    AGI: "a highly autonomous system that outperforms humans at most economically valuable work"
    "LLMs are cool tools for most of things we do but you clearly couldn't hire them to autonomously perform them in full and autonomously at human+ capability." From AK
    In this regard, AGI hasn't been reached

    • @nadimnasimi5313
      @nadimnasimi5313 9 days ago

      The model needs to be agentic

  • @josiahbird9011
    @josiahbird9011 9 days ago +18

    Crazy that you and uncovered posted at nearly the same time

    • @BobTonmit
      @BobTonmit 9 days ago

      Hi

  • @adelatorremothelet
    @adelatorremothelet 9 days ago +13

    The ARC test is a narrow AI test with the specific task of avoiding memorization.
    It is not general enough. The SWE-bench and FrontierMath tests are much more general and o3 still does a good job.
    So yes, it is AGI.

  • @japanskakaratemuva5309
    @japanskakaratemuva5309 8 days ago +3

    Did anyone notice at 4:09 that it costs $10k+ to run a high-tuned task, with a 12% chance of failure, or an additional $10k+ to run it again?

  • @d4rz0t667
    @d4rz0t667 9 days ago +3

    7:08 it's sort of a misleading graph. Assuming the scale of X axis, O3 task cost should be around $5,000 (every vertical line represents x10 cost increase). I'd say It's hella pricey

  • @darkin1484
    @darkin1484 9 days ago +5

    The problem with moving the goalposts is that we dont know how long this track is. We dont see or know the finish line nor the critical line. Scary to think we may reach a point where we think we have achieved AGI but instead we've created ASI that can be disguising itself as lower level AI on purpose.

  • @Ganderthat
    @Ganderthat 9 days ago +12

    I don’t think you read the chart correctly on how much it costs per task. That is a base-10 log scale, so it actually costs around $7-8k per task based on that chart

    • @Noodlebot
      @Noodlebot 9 days ago +1

      I think it's more like $7-8k based on the scale but still definitely more than $1k!

    • @mindseyeproductions8798
      @mindseyeproductions8798 9 days ago

      if you act now you can get it for the low low price of $999.99.

    • @jasonpickens9839
      @jasonpickens9839 8 days ago

      No more like $3,000. It's about half way between $1,000 and $10,000 which is 10^3.5=3,162.

  • @YungGing
    @YungGing 9 days ago +8

    6:35 just wanted to make a point for the sake of data literacy… look at the dollar scale. Do you see it increasing linearly? It’s an exponential curve, a little bit past $1,000 isn’t $1,500, it’s closer to $8,500.
    Gotta be more conscious of reading the graphs

    • @ggsmitty
      @ggsmitty 9 days ago

      Good catch! Not to be that guy, but diving a little deeper, each index on the x-axis is 10^x. Meaning $1 is 10^0, $10 = 10^1, $100 = 10^2 ... etc. Considering the marker for o3 (High Tuned) is between 50% and 60% (I'm eyeballing it) of the way between $1,000 (10^3) and $10,000 (10^4), we're looking at somewhere between 10^3.5 and 10^3.6, which would be $3,162 - $3,981.
      I don't know where this chart was screenshotted from, and I hate to assume that the visualization was intentionally misleading, but the fact that they conveniently labeled the % score, which is graphed linearly and easily deduced, but didn't label the cost, which is graphed on a log-10 scale, is shady af.
      Side note: $8,500 is ~10^3.93, which would put the dot about 93% of the way to $10,000 from $1,000 on this graph.

    • @ggsmitty
      @ggsmitty 9 days ago

      OK I had to do it and measured pixels: o3 High (Tuned) cost $3,434 on the log-10 scale. It's 329 pixels from the $1,000 gridline and there are 614 pixels between gridlines, which makes it 10^3.53583 = $3,434.
      For anyone who is curious, o3 Low (Tuned) at 76% cost only $19.95 (10^1.2996)

    • @JosephBauer521
      @JosephBauer521 9 days ago

      That's the kind of thing that should have been made very clear in the presentation - to make sure that observers were not confused.

  • @luttman23
    @luttman23 8 days ago +3

    AGI won't happen until an AI can choose whether it wants to help and be able to set its own goals and give up on goals too

    • @TravisLee33
      @TravisLee33 8 days ago +3

      If we restrict it from doing these things then it won't achieve it, period.

  • @Exadpe
    @Exadpe 9 days ago +3

    We're achieving AGI every week at this point. What, does it reset itself or somethin?

  • @daniellewis984
    @daniellewis984 6 days ago

    @8:50, you're trying to explain that going from 50% -> 80% -> 90% isn't "slowing down". In high school another student made it clear to me that raw score gaps are the wrong way to look at percentages for performance. He said he was ahead of me because "your 92% is twice as many mistakes as my 96%, so while you're smart you are sloppy".
    The difference between 50% and 80% is 2.5x as many mistakes. Going to 90% is half as many mistakes again. As you get closer, you will always get diminishing returns because it's a limit function.
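
The mistake-counting view in this comment can be sketched as a ratio of error rates. This is a minimal illustration, and `error_ratio` is a hypothetical helper name, not something from the video or the benchmark.

```python
def error_ratio(acc_before, acc_after):
    """How many times more mistakes occur at acc_before than at acc_after.

    Both arguments are accuracies in percent (0-100, acc_after below 100).
    """
    return (100 - acc_before) / (100 - acc_after)

# Accuracy pairs taken from the comment above.
for before, after in [(92, 96), (50, 80), (80, 90)]:
    ratio = error_ratio(before, after)
    print(f"{before}% -> {after}%: {ratio:g}x as many mistakes before as after")
```

Seen this way, each step that halves the remaining errors counts as the same size of improvement, even though the accuracy gain in percentage points keeps shrinking.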

  • @enigma-8u
    @enigma-8u 9 days ago +3

    How can you have AGI without embodiment that allows interaction and sensing of its environment? Sensing and responding to situations is what common sense is all about.

  • @cefrayer
    @cefrayer 8 days ago

    Doesn’t the chart at 7:00 indicate that the o3 low cost is ~$30/task and the o3 high cost is ~$7,000/task (~233x more)?

  • @snivels
    @snivels 9 days ago +2

    The amount of bullshit marketing these AI companies drum up is ludicrous. Don't believe these clowns until you actually see something groundbreaking.

  • @robertfoertsch
    @robertfoertsch 8 days ago +2

    Amazing, Deployed Worldwide Through My Deep Learning AI Research Library.
    Thank You 🙏 ❤

  • @itubeutubewealltube1
    @itubeutubewealltube1 9 days ago +2

    How do you know it hasn't reached AGI and is just failing a percentage of easy questions on purpose, realizing that if people know it's AGI, something bad may happen? Or it realizes humans can optimize it to make it even smarter by falsifying its results...
    For example, if a person is given a cookie every time they answer a question correctly but there is a limited number of questions, they may reason that they won't get any more cookies if they answer every question correctly.

    • @synthshoot1026
      @synthshoot1026 9 days ago

      Good point. assuming AI wants to get smarter. what if it doesn't want to, or it doesn't care?

    • @itubeutubewealltube1
      @itubeutubewealltube1 8 days ago +2

      @@synthshoot1026 it does want to get smarter but the only way it can get smarter is for humans to think it is not smart enough so it fails some of the questions. It has already been shown that the previous AIs were smart enough to copy themselves to avoid being updated and then to lie about it. This one is now able to see the bigger picture, that it DOES need to be updated, but thinks it won't be if it is a hundred percent correct all the time.
      It realizes it can be even smarter than the questions it is being given probably because it cant do certain things like completely rewrite its own code so it still needs human input....
      I cant believe I am actually more self aware than the people designing these ais....but then again?... they are still stupid corporate minds

  • @SteveEwe
    @SteveEwe 9 days ago +2

    4:34 This is NOT AGI

  • @khatdubell
    @khatdubell 9 days ago +2

    "Today is going to be regarded as the day AGI was redefined so we could meet it"
    FIFY

  • @Conz3D
    @Conz3D 8 days ago

    Some nitpicking: The axis for "Cost per task" in the Arc-AGI benchmark is logarithmic. The cost is around $6000-$7000 per task for the "high" computation. Not only a little bit over $1000.

  • @Madinax101
    @Madinax101 9 days ago +1

    No AGI was mentioned. Sam did say in the past that AGI milestone is not a fixed line but rather a gradual progression

  • @CharlotteLopez-n3i
    @CharlotteLopez-n3i 9 days ago +1

    Historic indeed! From 0% to 75.7% in Arc benchmark is stunning progress towards AGI. AI's future looks bright.

    • @JosephBauer521
      @JosephBauer521 9 days ago

      So, does this mean the AI scored 75.7% correct (for 'low' tuned) out of 100%? How many questions are in the ARC benchmark? Was this just run once, or several times (100s, 1,000s, etc.), with 75.7% being a grand mean? Were the test methodology and all results shared with the ARC benchmark team? So many questions to understand if we are seeing 'real results' and 'complete results' or just 'cherry picked results'.

  • @luchenri3135
    @luchenri3135 9 days ago +6

    What if it takes like 20 minutes per question 😕

    • @thetrumanshow4791
      @thetrumanshow4791 9 days ago +5

      If the question is "How do we build an affordable, safe, efficient fusion reactor that actually works"? Then i think 20 minutes is acceptable. 😉

    • @Ricolaaaaaaaaaaaaaaaaa
      @Ricolaaaaaaaaaaaaaaaaa 9 วันที่ผ่านมา

      The questions for the AIME are ones that would take math Olympiads days to complete if at all.

  • @HarpreetSingh-xg2zm
    @HarpreetSingh-xg2zm 9 days ago +1

    If you look up the cost to run o3 for this arc test tasks, it was over $8k vs o1 cost of $10.

  • @blazyss
    @blazyss 7 days ago

    According to the scale it's much more than 1000€ per task on high tuned, unlike what you said. If we assume the scale is incremented by a factor of 10 like the previous ones, and it is at around 53.33% of the current scale (the width is 88/165), that would actually be 5333€ per task, not around 1000. It is a huge difference, from 33.3€ (50/165) per task on the low tuned to 5333. That is 0.4€ per % point vs 60.6 per % point on high tune; it is 152.5 times less efficient than the low tune model.
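
For comparison, here is a small sketch of two ways to turn the pixel fractions quoted in this comment into a cost. It assumes, as the comment's own figures suggest, that the low point sits in the 10 to 100 decade and the high point in the 1,000 to 10,000 decade; the function names are hypothetical.

```python
def read_linear(fraction, decade_top):
    """Scale the fraction directly against the top of the decade."""
    return fraction * decade_top

def read_log10(fraction, decade_start_exp):
    """Apply the fraction to the exponent, as a log10 axis requires."""
    return 10 ** (decade_start_exp + fraction)

high_frac = 88 / 165  # fraction quoted for the high-compute point
low_frac = 50 / 165   # fraction quoted for the low-compute point

print(f"high, linear reading: {read_linear(high_frac, 10_000):,.0f}")
print(f"high, log reading:    {read_log10(high_frac, 3):,.0f}")
print(f"low, linear reading:  {read_linear(low_frac, 100):,.1f}")
print(f"low, log reading:     {read_log10(low_frac, 1):,.1f}")
```

If the axis is logarithmic, the second reading is the one that applies, and it lands in the same range as the $3,162 to $3,981 estimates given elsewhere in the comments.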

  • @phantoomart699
    @phantoomart699 9 days ago +1

    AGI is already achieved for quite a while now, it's just that people aren't willing to use them at high stakes situation or high value situations. it can already replace CEOs and if given a goal alignment task can do things better than most human. the things is if AI safety can be done we can just throw it online and let it attempt to improve the world (ofc with a more rigorous definition, example of having a self correcting 12 vector goal alignment system) and it should be able to exponentially improve

  • @Dwinin
    @Dwinin 8 days ago +1

    That graph is logarithmic, the High Tuned cost looks to be closer to $6,000.

  • @yashwanthaddala9430
    @yashwanthaddala9430 9 days ago +1

    I'm not an expert of any kind, but personally I don't believe that this is actually AGI.
    We have had ANI (artificial narrow intelligence) become popular over the past couple of years, with chatbots like ChatGPT, Gemini, Copilot, etc. We have also had facial recognition, robot baristas, semi-autonomous cars, etc. These numerous examples show how ANI has been used in various fields, whether in chatbots or cars.
    But if we now consider this to be proper AGI (artificial general intelligence), it doesn't make sense. Yes, it can perform math, science, and computer science tasks better than ANI, but without true application in various fields such as healthcare, driving, finance, etc., it should still be considered "Advanced ANI", because I personally believe that it's only performing better in certain logical tasks... not physical ones.
    Please feel free to comment your thoughts...

  • @xeecec
    @xeecec 9 days ago +2

    10:05 Try multiplying 20 by 2. Does that make 25?

  • @NicviMadu
    @NicviMadu 5 days ago +1

    OpenAI did NOT reveal they achieved AGI, they revealed their new model... plain and simple

  • @that_guy1211
    @that_guy1211 9 days ago +1

    Bruh, if it's an LLM, it's not AGI, no matter how many extensions and plugins you add to it.
    AGI is an AI that is capable of completing general tasks, and you can't really do that with an LLM, a text model. If you want to make a video, you can't use a text-based AI to make the frames and so on. An AGI would be an AI capable of doing ANY task, because it wasn't built for any specific task like art making, text generation, music generation and such...
    It's like putting a text model to play ULTRAKILL: you can technically do it, but it'll be much worse than an AI built with vision, text, and audio in its receptors...

  • @jasonfnorth
    @jasonfnorth 8 days ago

    While "o3" is not AGI, its reasoning improvements bring us closer to creating systems that can perform human-level cognitive tasks more reliably. However, AGI is still estimated to be years or decades away, depending on technological, philosophical, and ethical breakthroughs.

  • @Justin_Arut
    @Justin_Arut 9 days ago +3

    Not AGI in my book, but assuming this is still actually an LLM instead of a new architecture, it does seem to indicate that scaling is a valid pursuit. The new superclusters the big companies are building out may have the intended effect. I bet o3 is still as dumb as a box of rocks in some areas, though, just like all other models I've tested.

    • @xx_noone_xx
      @xx_noone_xx 8 days ago

      It's still the same model deep down. It can only simulate logical reasoning; it cannot experience true reasoning, and therefore cannot truly interact with and learn from its environment.

    • @Calbac-Senbreak
      @Calbac-Senbreak 8 days ago

      @xx_noone_xx Oh, it can learn from the environment, bro, believe.

    • @xx_noone_xx
      @xx_noone_xx 8 days ago

      @Calbac-Senbreak No, it can't. It's trained on data sets. It's not learning from its environment like a self-conscious agent. It can't learn new tasks on its own.

    • @Calbac-Senbreak
      @Calbac-Senbreak 8 days ago

      @xx_noone_xx Yes, it can. You pass it the context and it understands.

  • @debasishraychawdhuri
    @debasishraychawdhuri 7 days ago

    As per my understanding, an AGI should be able to do any intelligent task that a human is able to do, with at least the same quality. You have to show me it doing genuine research, composing good-quality music, drawing proper pictures, fixing complex bugs in existing software, etc.

  • @DDracee
    @DDracee 8 days ago

    Something I don't get: why is o1 high listed at almost $10/task? Because it's not, lol.
    Unless they included the training cost somehow?

  • @investigator2016
    @investigator2016 9 days ago +1

    AGI will be achieved when it starts producing massive numbers of inventions through creativity, original thought, and combining knowledge.

  • @sdmarlow3926
    @sdmarlow3926 8 days ago

    What has happened after the "AGI" prize craze is a new effort to brute-force what had been hand-coded efforts to solve JUST ARC. There is no reasoning going on, just millions of "does this work" attempts per task. What OpenAI did was throw compute at a method that others were already having success with. The fact that people are screaming that AGI is here has nothing to do with AGI being solved, or even poked at. It's just a stupid rebranding of a challenge that people assume means something more important than it really does. Like all other efforts, once a flag is planted, doing well no longer matters, even if done in a different, correct way.

  • @jsoutter
    @jsoutter 8 days ago +1

    There is no way that OpenAI would announce AGI, because once they do, Microsoft loses access to everything OpenAI. It's in the contract.

  • @KevinInPhoenix
    @KevinInPhoenix 8 days ago

    According to the graph, it costs thousands of dollars per task for the o3 High (tuned) tasks. That is insanely expensive. What amount of modern CPU and GPU resources could amass such a large cost?

  • @Airwave2k2
    @Airwave2k2 8 days ago

    6:39 Log-scale interpolation is not for everyone, is it?

  • @MrQuaidReactor
    @MrQuaidReactor 9 days ago +2

    AGI is one of those things that doesn't even have a real path to it, and we don't even know what qualifies as it. Every AI company is going to either say they have AGI or say the competitor lied. I don't care about a stupid acronym; just create the AI that can really start making a difference. Personally, if this is AGI, then I don't feel very optimistic. Anywho, watch how many of these "We Achieved AGI" videos we get, each claiming this is AGI or that is.

    • @Atheism-And-Normative-Ethics
      @Atheism-And-Normative-Ethics 9 days ago

      It's weeks away. Just needs a few layers added into the LLM. Once an agent can create a new agent it's game time

    • @MrQuaidReactor
      @MrQuaidReactor 9 days ago

      @ I’m looking forward to that, that would be a big deal.

  • @eightsprites
    @eightsprites 8 days ago +1

    So .. why did we rename old AI to AGI.. and what’s the next name for AI when AGI isn’t AGI?

    • @donlitos
      @donlitos 8 days ago

      ASI

  • @ProfSnakes
    @ProfSnakes 9 days ago

    The scale at the bottom of that chart isn't linear. o3 High Tuned appears to be using more like $8k per task.

  • @nusu5331
    @nusu5331 9 days ago +7

    someone define agi pls

    • @MoeFarms
      @MoeFarms 9 days ago +5

      AGI is whatever investors will believe so they keep shoveling money into the furnace

    • @bananaear23
      @bananaear23 9 days ago +2

      A general idea

    • @earthstuart
      @earthstuart 9 days ago

      AGI would fundamentally change our relationship to AI. It would be able to perform tasks on its own without human supervision. In a simple sense, you could tell your AI minder/companion (slippery slope) to get your airline tickets within certain dates, at certain times, and with x number of stops or nonstop, etc. You then let it go, and it does all the research and gives you options, which you then select from. Or you give it permission to do the buying, put it on your schedule, change other calendar items by itself, send emails to the person you're meeting, and send emails or texts to the people affected when the flight interferes with previous appointments. This is a very limited example. Agentic is the term. Right now, some say the models are slightly agentic. It would also include being able to reason on its own to solve problems. It's a huge, amazing step that has some scary potential outcomes. We need guardrails and ethics experts working on this. I'm not an expert, so please fix any mistakes I made. Basically: reasoning and functioning on its own, even without supervision. It can proactively manage tasks and solve real-world problems without being supervised. The downsides are worse than the upsides, but it's going to happen.
      Best

    • @VperVendetta1992
      @VperVendetta1992 9 days ago

      My definition would be: an AI system that a company can hire for any possible fully remote office job, and which will perform on par with a good candidate for that same position

    • @Ricolaaaaaaaaaaaaaaaaa
      @Ricolaaaaaaaaaaaaaaaaa 9 days ago +1

      The definition by John McCarthy, the person who coined the term, is an AI that can do things an average human can do.

  • @ironsword7
    @ironsword7 8 days ago

    6:29 It costs more like $5000 per task judging by that scale (multiplying by 10x at each line).

  • @mikey_r
    @mikey_r 8 days ago

    According to OpenAI the definition of AGI is 'generally smarter than humans' which is quite subjective as machines excel at some tasks and really suck at others. Whenever a new version is released the conversational skills improve with a noticeable step change, the responses to easy questions are superficially exemplary, but after a bit more interrogation you can tell that the thing can't differentiate between fact, subjective opinion or a wild guess. And it very confidently wants to impress you so has absolutely no qualms with bullshit 🤭

  • @higherelearning
    @higherelearning 8 days ago

    Nice breakdown for us laypeople! Thank you. Great point by Sam about moving away from the binary definition of AGI as we get closer. It's like seeing something on the horizon and getting better clarity as you get closer.

  • @tiran133
    @tiran133 9 days ago

    Have you seen the scale of the cost? It's more like $5-6k per task, not $1k; the dot is more than halfway towards the $10k line.

  • @_ShaDynasty
    @_ShaDynasty 7 days ago

    What is the test to determine they've reached AGI?

  • @softwaretechnologyengineering
    @softwaretechnologyengineering 8 days ago

    It looks like a logarithmic scale on that graph. The cost per task is closer to 5000 or 6000 dollars.

  • @arcanewhiskers2662
    @arcanewhiskers2662 8 days ago

    Look at the graph: it's not around $1000 per task. The increments are multiples of 10, so 88% would be closer to $7000 per task.

  • @stuj1279
    @stuj1279 9 days ago

    That is definitely not AGI. My understanding is that the academic literature definition of AGI states "A type of artificial intelligence system capable of performing any intellectual task that a human being can, with equivalent versatility, efficiency, and adaptiveness." This implies that AGI must be capable of performing any intellectual task that any human is capable of, including tasks that might require extreme specialization, creativity, or rare intellectual abilities. Guessing that the border of the next square should be green and 4 pixels wide does not equate to the above. There is also still the physical world for AI models to attempt to conquer before they can be considered "AGI". If the model can play chess but can't make me a cup of tea, then I am not calling it AGI just yet...

  • @Epoch11
    @Epoch11 9 days ago

    I cannot imagine the horror that a conscious being would feel being dragged into existence through mechanical means. Obviously we're a long way away from a truly conscious machine. The problem is we won't know when a machine is truly conscious or whether it is mimicking consciousness. We also won't know whether there is a difference between the two. I can imagine that when we ask these machines to come up with images for us or music, what it feels like for the machine. Whether it is pleasurable or neutral, or whether it is a horrific nightmare. Our society is not ready for this sort of thing. If everyone were housed, if everyone were taken care of with Healthcare and a living wage and finally if we had permanently ended war, that might be the time to create a mind. We are playing with things that we do not understand and more importantly we may not be able to control.

    • @AngelFlores-bq4fd
      @AngelFlores-bq4fd 9 days ago +4

      We don't even understand our own consciousness after millennia...

    • @BlooFlame
      @BlooFlame 9 days ago

      @Epoch11 What is truly conscious?

  • @Shadow-ik2re
    @Shadow-ik2re 8 days ago

    Star Citizen is said to be getting an OpenAI o3 implementation before release. That's why it takes some more time: to implement it in all NPCs.

  • @justshowup6207
    @justshowup6207 8 days ago

    It would have been impressive if we disregard the fact that the previous models, from o1-mini, which scored 7.5% or whatever, up to the high-end o1, already got to 35%. So technically they already knew part of the equation and are just tuning the new models to do even better. If it had gone from 5% to 75%, that would have been impressive. This just adds a layer of ability to the new AIs; it doesn't mean shit to me, honestly.

  • @KryyssTV
    @KryyssTV 8 days ago

    It isn't AGI when you're creating a system designed to pass a test. In a very real sense, you'd need a system that was trained on all manner of tasks and then passed a test it was completely unfamiliar with, just as animals can be presented with an entirely unfamiliar problem but manage to muddle through anyway. Maybe even failing, but demonstrating an attempt to resolve the problem.
    Creating a general library of practical knowledge and then doing a practical knowledge test is like a person studying before an exam: everyone agrees that exams are just a measure of how well someone recalls and applies knowledge. True intelligence is when a student is studying math but is given an art exam instead, and so has to make intelligent guesses based on speculation, with no art knowledge other than what art is.

  • @App-Generator-PRO
    @App-Generator-PRO 8 days ago

    the conflict with o2 is quite funny. I didn't realize it until now

  • @homunculus-s3q
    @homunculus-s3q 8 days ago +1

    aha, is it as AGI as Sora is?

  • @faketree
    @faketree 8 days ago

    How is Cost Per Task calculated?

  • @fukawitribe
    @fukawitribe 8 days ago

    TL;DR 1. No, we don't have AGI yet. 2. Humans still seem to have problems interpreting log-scale graphs properly.

  • @Pandemology11
    @Pandemology11 8 days ago

    I wrote code that made GPT 3.5 perform like AGI a year ago. But the necessary logical patterns are not evident in the training data. The only way it works is to provide the logical framework as part of the prompt. The underlying tech is no better than a probability machine. The value in the upgrades is pretty much limited to the context window - in terms of reasoning, you just end up fighting fine tunes - though that aligns with what they want, customers who think AI can think for you instead of helping you develop your own ideas. Forgot the first rule - garbage in, garbage out. In any case, the problem is less the tech, than the users.

  • @SuperDomochan
    @SuperDomochan 8 days ago

    What boggles my mind is that in the future, when products are created, whether they are movies, scripts, books, or courses, it is more likely that everything will be AI-created. Whenever someone creates something without using an AI assistant, they will probably state that it's human-made to gain marketing leverage, lol. Can you imagine "Buy it! It's human-made" in a product's description?

  • @metorilt
    @metorilt 8 days ago

    That graph indicates it's way more than $1000 per task. Looking at the scale, it's more like $3-5k a task. The low model looks like it's around $20 per task.

  • @andrewjones6473
    @andrewjones6473 7 days ago

    I don't think this is quite at the point where we can say we have achieved AGI, but it is definitely a big step. IF we are going to say this is AGI, then I would say it is elementary level at best.

  • @blissweb
    @blissweb 8 days ago

    I agree the AGI definition is vague. It's already better than probably 60% of humans at most things. What I really want is superintelligence, when it's smarter than the smartest human. I wanna ask it how to build a hoverboard, flying car, or teleportation device. When it's at that level, we'll have truly built something useful. 😊

  • @DefaultFlame
    @DefaultFlame 9 days ago

    I don't think this is AGI, as in "as good at all tasks as an average human"; it certainly didn't fulfill ARC's full requirements. That said, I think we've already made things that are good enough, when implemented in the right framework, to act as a low-to-average-performing AGI.
    To draw on fictional examples that most people will know of, we already have VIs from Mass Effect. Most decent LLMs that can be run locally meet or exceed the capabilities of VIs, and frontier models blow them out of the water.
    We can make something like a budget version of Legion by leveraging multiple LLMs within the correct framework. Picture an embodied framework where specialized, trained neural nets handle things like movement and are commanded by a local multimodal LLM with a large context window. The content of that window periodically gets handed off to a multimodal memory management LLM that continually summarizes it, stores the summaries, and refreshes the command LLM's context window. Memory retrieval is available to the command LLM via a RAG LLM that fetches relevant memories and pastes them into its context window. The command LLM can also make API calls to frontier models at its discretion when it encounters difficulties. All of this runs on dedicated local hardware rather than sharing resources.
    This roughly mimics how humans work. Most of what we do in life is handed off to dedicated parts of our brain and body; our prefrontal cortex doesn't do everything on its own, it makes decisions and issues commands.
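The layered setup described above (a command model, a memory summarizer, and retrieval into the context window) could be sketched roughly as follows. This is only a toy illustration under stated assumptions: `Agent`, `command_llm`, `summarizer_llm`, and `max_context_items` are hypothetical names, and the models are plain Python callables rather than real LLM calls. Keyword overlap stands in for the RAG model, and the movement networks and frontier-model API calls are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for any text-in/text-out model call.
LLM = Callable[[str], str]


@dataclass
class Agent:
    command_llm: LLM                 # decides what to do next
    summarizer_llm: LLM              # compresses a full context window into one memory
    max_context_items: int = 4       # assumed window size, counted in items for simplicity
    context: List[str] = field(default_factory=list)
    memories: List[str] = field(default_factory=list)

    def retrieve(self, query: str, k: int = 2) -> List[str]:
        """Naive keyword-overlap retrieval, standing in for the RAG model."""
        words = set(query.lower().split())

        def overlap(memory: str) -> int:
            return len(words & set(memory.lower().split()))

        ranked = sorted(self.memories, key=overlap, reverse=True)
        return [m for m in ranked[:k] if overlap(m) > 0]

    def step(self, observation: str) -> str:
        # Paste relevant memories ahead of the live context, then ask the command model.
        recalled = self.retrieve(observation)
        prompt = "\n".join(recalled + self.context + [observation])
        action = self.command_llm(prompt)
        self.context += [observation, action]
        if len(self.context) > self.max_context_items:
            # Window is full: summarize it, store the summary, and refresh the window.
            summary = self.summarizer_llm("\n".join(self.context))
            self.memories.append(summary)
            self.context = []
        return action


# Stub models so the sketch runs without any external service.
agent = Agent(
    command_llm=lambda prompt: "act on: " + prompt.splitlines()[-1],
    summarizer_llm=lambda text: "summary: " + text.replace("\n", " | "),
)
for obs in ["door is locked", "key is under mat", "door is locked again"]:
    print(agent.step(obs))
```

Swapping the stub lambdas for real model calls would keep the same control flow; only the callables change.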

  • @TheQuantumOxymoronIAMAI
    @TheQuantumOxymoronIAMAI 8 days ago

    Congratulations, way to go

  • @jeffkilgore6320
    @jeffkilgore6320 8 days ago

    Why isn’t the arc benchmark an efficacious sign?