Explaining OpenAI's o1 Reasoning Models

Comments • 37

  • @tn919
    @tn919 several months ago +4

    Thank you, Sam, for continuing to do these videos. It's very helpful to get an explanation of where things currently stand with these models. When I saw this, it reminded me very much of LangChain and the approach of interpreting what the user is asking and, based on that interpretation, handing the "tasks" (things to be solved) to more specialized models.

    • @concernedindian144
      @concernedindian144 several months ago

      I saw/heard many users saying it's similar to an approach by LangChain. Is there any tutorial/video where they show how to do that?

  • @el_arte
    @el_arte several months ago +4

    Thanks, Sam. I have been getting these kinds of results for some time with hierarchical prompting (chains or flowcharts), multiple turns, and code interpreter, using GPT-4o mini. Of course, at the expense of tokens.
    Now, if OpenAI was able to bake all of it into one inference pass, then their approach is far superior.
    But, since they are API-based, this will remain a mystery.
    I think the API approach is the secret to delivering AGI in the long term, as LLMs alone can’t get us there and you cannot ask your customers to orchestrate the many processes required to get there.

    • @MikeMm-n9n
      @MikeMm-n9n several months ago +1

      Interesting. Can you share an example?

  • @Sonic2kDBS
    @Sonic2kDBS several months ago +2

    Interesting video. I should mention that the tokens shown on the OpenAI website are just a summary of the actual reasoning. That is why there are so "few tokens" to see, and why it looks like they use more tokens for reasoning over the API than on the website. Keep it up :)
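
The gap between shown and billed tokens can be made concrete with a little arithmetic. The sketch below is illustrative only: the token counts are invented, the `Usage` class and `output_cost_usd` helper are hypothetical names, and the rate of $60 per million output tokens is an assumption rather than a quoted price. It relies on the point that hidden reasoning tokens are billed as output tokens even though only a summary is displayed.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical token counts for a single response."""
    visible_tokens: int    # tokens in the answer the user actually sees
    reasoning_tokens: int  # hidden reasoning tokens (only summarized in the UI)

    @property
    def billed_output_tokens(self) -> int:
        # Hidden reasoning is billed as output alongside the visible answer.
        return self.visible_tokens + self.reasoning_tokens


def output_cost_usd(usage: Usage, usd_per_million: float = 60.0) -> float:
    """Output-side cost at an assumed per-million-token rate."""
    return usage.billed_output_tokens * usd_per_million / 1_000_000


# Invented numbers: a short visible answer backed by a long hidden chain.
example = Usage(visible_tokens=500, reasoning_tokens=4_500)
print(example.billed_output_tokens)
print(output_cost_usd(example))
```

With these made-up numbers, nine tenths of the billed output is reasoning the user never sees in full, which would explain why API usage looks heavier than what the website displays.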

  • @novantha1
    @novantha1 several months ago +4

    My suspicion is that this style of inference-heavy reasoning capability might actually be limited to edge deployment. This is a really expensive form of inference that IMO doesn’t match the business model of large corporations, where they generally have an attitude of “We’ll spend an extra $10 million in training if it means we can deploy a 10% smaller model”, but to an end user the equation is kind of backwards; “If I can let the model run for longer, and I get better reasoning capabilities for fewer training dollars and the same quantity of RAM use on my device, that sounds pretty good”.
    I think for certain tasks we could see quite modest hardware doing very impressive performance with something like this.

  • @emolamol
    @emolamol several months ago +4

    love the reasoning in these videos

  • @indexed2232
    @indexed2232 several months ago +1

    enjoyed going through the new models together through your videos along with the demos

  • @SwapperTheFirst
    @SwapperTheFirst several months ago +2

    Thanks. I'm a simple man with a simple question: is it better than Sonnet 3.5 for coding tasks?

    • @samwitteveenai
      @samwitteveenai several months ago +3

      It depends on what you are doing. For most things Sonnet will be better, but for architecting things from scratch this seems to do well in my early tests.

  • @formigarafa
    @formigarafa several months ago +2

    This whole process looks a lot like a router LLM: some specific models for planning and breaking down the chain of thought, a model trained to sometimes disagree with the previous output, and a small bunch of agents to glue everything together.
    And now they just charge for tokens on all the models called but provide only the final result, which is what most users expect.
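
The arrangement speculated about above can be sketched as a toy pipeline. This is not how o1 is known to work; it only illustrates the commenter's guess. Every name here (`plan`, `route`, `critic`, `SPECIALISTS`, `run`) is hypothetical, the "models" are plain string functions standing in for LLM calls, and tokens are approximated by whitespace-separated words.

```python
from typing import Callable, Dict, List, Tuple


def plan(task: str) -> List[str]:
    """Stand-in planner: break a task into steps on the word 'then'."""
    return [step.strip() for step in task.split(" then ") if step.strip()]


def route(step: str) -> str:
    """Stand-in router: send steps containing digits to the 'math' specialist."""
    return "math" if any(ch.isdigit() for ch in step) else "general"


# Stand-in specialist models; real ones would be separate LLM calls.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "math": lambda step: f"[math] {step}",
    "general": lambda step: f"[general] {step}",
}


def critic(answer: str) -> bool:
    """Stand-in critic: accept any non-empty answer, reject otherwise."""
    return bool(answer.strip())


def run(task: str) -> Tuple[str, int]:
    """Run every step, bill for all intermediate output, return only the last answer."""
    billed_tokens = 0
    answers: List[str] = []
    for step in plan(task):
        answer = SPECIALISTS[route(step)](step)
        billed_tokens += len(answer.split())  # every call is charged
        if not critic(answer):
            # Critic disagreed: retry with the general model (also charged).
            answer = SPECIALISTS["general"](step)
            billed_tokens += len(answer.split())
        answers.append(answer)
    final = answers[-1] if answers else ""
    return final, billed_tokens


final, billed = run("add 2 and 3 then summarize the result")
print(final)
print(billed)
```

The caller sees only `final`, while `billed` counts the output of every intermediate call, which mirrors the billing pattern the comment describes.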

  • @WillJohnston-wg9ew
    @WillJohnston-wg9ew several months ago +3

    What a great analysis and summary! I wonder if this is being released because of a lack of real progress on 5o and a realization that the 10x improvement is just not achievable without some kind of big new breakthrough. I suspect they may have hit a wall with 'human-like' reasoning and instead found these methods of doing higher-quality logical reasoning. It would be great if you could do a video on what is happening with Google's Project Astra and whether there is an API or Colab. Also, it seems that in some cases it might save costs by being more efficient in getting to an answer?

  • @GriffinBrown-tq9jz
    @GriffinBrown-tq9jz several months ago +3

    Couldn't wait to have your grounded explanation of this new model

  • @davidwipperfurth8465
    @davidwipperfurth8465 several months ago +17

    OpenAI seems to redefine "Open" with every announcement.

  • @Diego_UG
    @Diego_UG several months ago

    It all sounds great, but I have a question. The fact that they are hiding the reasoning raises many questions: could this not simply be an agent system behind an API? The same results could be achieved with agents (this has already been done). I also found it curious that it came out right after Reflection was launched (I know that didn't turn out well), but it had a similar idea of an embedded chain of thought. In view of this, I could believe that o1 is a model, but will it be as powerful as they say? Or is it just an agent system that uses a lot of computing power? And if I am wrong, what is the proof that it is 100% a model?

  • @bastabey2652
    @bastabey2652 several months ago +5

    the headings of the steps in the thinking process might be effective marketing gimmicks

  • @asksearchknock
    @asksearchknock several months ago

    Thank you for the video - saves me reading the docs 😊

  • @karlwest437
    @karlwest437 several months ago +1

    How does it decide which chain of thought is best, if it doesn't know what the correct answer is?

  • @micbab-vg2mu
    @micbab-vg2mu several months ago

    Thanks for the update :)

  • @kevinehsani3358
    @kevinehsani3358 several months ago

    Have they introduced, or said that they are going to introduce, some kind of caching like Claude's to help reduce token costs?

    • @samwitteveenai
      @samwitteveenai several months ago

      nothing public about caching yet unfortunately

  • @simonsmashup
    @simonsmashup 20 days ago

    I'm starting to think: if we teach it the methods for solving all the problems we have solved previously, it is already better than a large percentage of humans, because it's simply impossible to learn how to solve them all in a human lifetime. It's kind of like teaching kids to solve problems in school, except you can teach the models every kind of problem that exists.
    Then they may not be able to solve new kinds of problems that the model has never encountered. I guess that's a general intelligence problem. I also think most human beings are not as intelligent as what AI researchers expect of AI. I feel like the researchers are just too smart to remember that there are a lot of people in this world who are not that intelligent.

  • @bofeng6910
    @bofeng6910 several months ago

  • @Anselm243
    @Anselm243 several months ago

    These models, from GPT-3.5 to o1, still struggle with basic addition and subtraction involving more than 20 numbers. This is not limited to GPT; Claude struggles too.

  • @IdPreferNot1
    @IdPreferNot1 several months ago

    People need hand-holding. Until they demonstrate the capabilities of these models, no one is going to pay $60 token rates. Truly, these demonstrations of logic are so lame. The voice mode ones were more immediately interpretable: "oh, I could use that". And then they ghost most of us on that feature. And yes, only API users need this new power, and we'd be happy to at least explore it. And yet you have to be tier 5 to use it. The people guiding their decisions must truly be some McKinsey mgmt consultant morons.

  • @jay-dj4ui
    @jay-dj4ui several months ago +1

    expensive thinking.....

  • @ClaudeCOULOMBE
    @ClaudeCOULOMBE several months ago

    Thank you! Nice try... But the reality is that we don't know, apart from some marketing and hype-driven stuff... OpenAI is only 'Open' in name.

    • @samwitteveenai
      @samwitteveenai several months ago

      Totally agree they aren't very open, but most of the ideas here are not that new except for how they are doing the RL.

  • @0xunknown336
    @0xunknown336 several months ago +1

    It's not thinking; AI can't think. It's processing. I like your videos, but to be honest, this one is disappointing.

    • @samwitteveenai
      @samwitteveenai several months ago +4

      That’s a fair point. I agree I probably could have put “thinking” in inverted commas. I am curious, though: how would you define the difference between thinking and processing? When does processing become thinking?

    • @lucasjans
      @lucasjans several months ago +1

      @samwitteveenai Thinking is when it can handle novel situations that have never been seen before. While training on pre-existing reasoning datasets gives models simulated thinking and offers a lot of value, it is not the same.

    • @samwitteveenai
      @samwitteveenai several months ago +4

      So the ability to generalize. I totally agree this is the goal for all models, generative or not.
      I am in two minds about the ability of these models to generalize. On one hand, they clearly can do a lot of tasks like coding where they produce outputs that other models haven't done well on. On the other, OpenAI is training with a lot of synthetic data, much of which has come from the probabilities of inputs that people put into their models, i.e. there are not a huge number of novel situations they haven't seen that people are now suddenly putting into their models.
      I think the models do have some amount of generalization, but I would agree that it is not as much as a lot of people think.

    • @WillJohnston-wg9ew
      @WillJohnston-wg9ew several months ago +2

      @samwitteveenai I would think that the sum of the last 5 digits of pi was novel and 'thinking'. When the model can outline its reasoning path, it's not that much different from a human thought process.

    • @RalphFreeman-ok5of
      @RalphFreeman-ok5of several months ago +2

      @samwitteveenai I tend to think processing is doing a set of actions that have been done before, where you are following a recognised procedure. With thinking there is often no procedure because it's not been done before. The result of thinking may be the creation of a procedure.