Q* - Clues to the Puzzle?

  • Published 23 Nov 2023
  • Are these some clues to the Q* (Q star) mystery? Featuring barely noticed references, YouTube videos, article exclusives and more, I put together a theory about OpenAI’s apparent breakthrough. Join me for the journey and let me know what you think at the end.
    www.assemblyai.com/playground
    AI Explained Bot: chat.openai.com/g/g-804sC5lJ6...
    AI Explained Twitter: / aiexplainedyt
    Lukasz Kaiser Videos: • Deep Learning Decade a...
    • Lukasz Kaiser (OpenAI)...
    Let’s Verify Step by Step: arxiv.org/abs/2305.20050
    The Information Exclusive: www.theinformation.com/articl...
    Reuters Article: www.reuters.com/technology/sa...
    Original Test Time Compute Paper: arxiv.org/pdf/2104.03113.pdf
    OpenAI Denial: / 1727472179283919032
    DeepMind Music: deepmind.google/discover/blog...
    Altman Angelo: / sama
    Karpathy: peterjliu/status/...
    STaR: arxiv.org/abs/2203.14465
    Noam Brown Tweets: polynoamial/statu...
    Q Policy: www.analyticsvidhya.com/blog/...
    Sutskever Alignment: • Ilya Sutskever - Openi...
    / aiexplained Non-Hype, Free Newsletter: signaltonoise.beehiiv.com/
  • Science & Technology

Comments • 996

  • @aiexplained-official
    @aiexplained-official  5 หลายเดือนก่อน +584

    My computer crashed 7 times while making this video and I had a hard deadline to get a flight. There is little of my normal editing in here, or captions, just my raw investigation! Do follow the links for more details.

    • @literailly
      @literailly 5 หลายเดือนก่อน +26

      We appreciate your dedication, sir!

    • @JohnVance
      @JohnVance 5 หลายเดือนก่อน +38

      Still the best AI channel on YouTube, none of the hype of the other channels. Maybe the news cycle will calm down and you can get some sleep!

    • @patronspatron7681
      @patronspatron7681 5 หลายเดือนก่อน +1

      Bon voyage

    • @thebrownfrog
      @thebrownfrog 5 หลายเดือนก่อน +4

      It's great as always!

    • @alertbri
      @alertbri 5 หลายเดือนก่อน +6

      You did a great job Philip, as always! Much appreciated attention to detail and balance. Exciting times ahead! Have a safe trip. 🙏👍

  • @SaInTDomagos
    @SaInTDomagos 5 หลายเดือนก่อน +601

    Dude woke up and thought to himself, "How thorough will I be today?" and said: “Yes!” You definitely should get some interviews with those top researchers.

    • @Dannnneh
      @Dannnneh 5 หลายเดือนก่อน +11

      Oooh, that would be interesting!

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +139

      Stay tuned :)

    • @JustinHalford
      @JustinHalford 5 หลายเดือนก่อน

      @@aiexplained-official🔥🫡

    • @daikennett
      @daikennett 5 หลายเดือนก่อน +11

      We'll hold you to this. ;) @@aiexplained-official

    • @DaveShap
      @DaveShap 5 หลายเดือนก่อน +7

      Philip is nothing if not thorough. Dude reads like several novels worth of text per day.

  • @nathanfielding8587
    @nathanfielding8587 5 หลายเดือนก่อน +424

    I'm truly grateful for this channel. Finding accurate news about almost anything is hard as heck, and having accurate AI news is especially important. We can't afford to be misled.

    • @akathelobster1914
      @akathelobster1914 5 หลายเดือนก่อน

      He's good, I'm very interested in reading the references.

  • @gaborfuisz9516
    @gaborfuisz9516 5 หลายเดือนก่อน +649

    Who else is addicted to this channel?

    • @danielbrockman7402
      @danielbrockman7402 5 หลายเดือนก่อน +6

      me

    • @FranXiT
      @FranXiT 5 หลายเดือนก่อน +12

      He is literally me

    • @a.thales7641
      @a.thales7641 5 หลายเดือนก่อน +3

      I am

    • @shaftymaze
      @shaftymaze 5 หลายเดือนก่อน +5

      7 min later. He digs a bit further than I have time to. And yeah, Ilya was on our side (humanity's). Remember that.

    • @ytrew9717
      @ytrew9717 5 หลายเดือนก่อน +3

      who else do you follow? (Please feed me)

  • @DevinSloan
    @DevinSloan 5 หลายเดือนก่อน +169

    Ah, the Q* video I have been waiting for from the only YouTuber I really trust on the subject. Thanks!

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +18

      Let me know what you think of the theory

    • @AllisterVinris
      @AllisterVinris 5 หลายเดือนก่อน

      Same

    • @Elintasokas
      @Elintasokas 5 หลายเดือนก่อน +1

      @@aiexplained-official Rather hypothesis, not theory.

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +13

      @@Elintasokas but the evidence came first, so a theory no?

    • @sebby007
      @sebby007 5 หลายเดือนก่อน

      My thought exactly

  • @dcgamer1027
    @dcgamer1027 5 หลายเดือนก่อน +152

    I'd expect the Q to refer to Q-learning. Human beings think/function by predicting the future and acting upon those predictions, at least at a subconscious level. The way we make these predictions is by simulating our environment and observing what would happen in different variations of that simulation given the different choices we make. We then pick the future we feel is best and take the actions to manifest that future.
    I think a good example might be walking through a messy room with Legos everywhere. You observe that environment (the room), identify the hazards (Legos), then plan out a course through the room of where you can step to be safe (not step on a Lego). You would imagine that stepping in one spot would mean you are stuck or would step on a Lego, so that whole route is bad and you try another. Repeat till you find a solution or decide there isn't one and just pick some Legos up, or give up, or whatever. Of course not everyone does this; some people just walk on through without thought and either accept stepping on Legos or regret that they did not stop to think. These emotional responses of accepting consequences or regretting them are more akin to reinforcement learning imo. There are times when you need to act without thought, for example, if the room was on fire you might not have the time (or compute) to plan it all out.
    The Q learning stuff, in the context of these LLMs, seems like it would be their version of simulating the future/environment. It would generate a whole bunch of potential options(futures) then pick the best one. The difficult task there is creating a program that knows what the best option actually is, but they apparently already have that figured out.
    My bet is we will need to add in a few different systems of ‘thought’ that the AI can choose from given different contexts and circumstances, these different methods of decision-making will become tools for the AI to use and deploy and at that point it will really look like AGI. That’s just my guess and who knows how many tools it will even need.
    Either way it's cool to see progress and all this stuff is so cool and exciting.
    Now to go look for some mundane job so I can eat and pay off student loans lmao, post-money world come quickly plz XD.

    • @gregoryallen0001
      @gregoryallen0001 5 หลายเดือนก่อน +7

      normally a long post like this will be trash so THANK YOU for this helpful and engaging response ❤

    • @RichardGrigonis
      @RichardGrigonis 5 หลายเดือนก่อน +4

      Many years ago AI researchers speculated how to represent "thoughts." One approach was to treat them essentially as "mental objects," the other was to resort to possible worlds theory.

    • @GS-tk1hk
      @GS-tk1hk 5 หลายเดือนก่อน +5

      What you described is just reinforcement learning; Q-learning is a specific algorithm for solving the RL objective, and the "Q" refers to the Q-function, which has a specific meaning in RL. It seems likely that Q* refers to the Q-function (and star generally means "optimal"), but not necessarily the Q-learning algorithm. (A minimal sketch of the Q-learning update follows this thread.)

    • @kokopelli314
      @kokopelli314 5 หลายเดือนก่อน

      But if you have the whole world in q learning you can just use your intelligence to make money and pay someone to sweep up the room

    • @lucasblanc1295
      @lucasblanc1295 5 หลายเดือนก่อน +1

      Anyone that has played a bit with those LLMs intuitively knows that already. I prompt it with chain-of-thought and other reasoning methods all the time, like "Write a truth table to check for errors in our logic". The major issue I always arrive at is that it always ends up getting stuck somewhere along its line of reasoning and it needs human intervention. This happens exactly because it was never taught how to think and structure its thoughts; that was just a side effect of language. I believe once it's able to reason through mathematical problems with the proper proofs, it will be able to generalize to any field due to its lateral knowledge transfer. So, they will just need to keep fine-tuning the model in that direction, effectively creating a feedback loop of improving its capability at reasoning correctly, so that it will require fewer parameters and less compute for the same quality. And adding on top of that new breakthroughs such as bigger context windows, AGI is just a matter of quantity and quality of the same technique.
      Just run that thing in a loop, because that's how thinking happens. It's a trial and error process. Then, fine-tune it at being better at trial-and-error processes, instead of simply giving seemingly useful answers. We were simply being lazy about it by tuning it towards being useful quickly, without caring about how it's doing it in the first place.
      It is already AGI, but it's severely misaligned, just like GPT-3 was impressive before Chat fine-tuning. Now, we are fine-tuning Chat as Q*. It's just a step.
      After Q*, it will probably be fine-tuned for improvement at further generalization, instead of simply the domain of math/programming.
      This will be tricky to train; humans don't generate textual content for the sake of thinking through it, perhaps only mathematical proofs get there, and it's extremely time-consuming. Because we make assumptions about the reader's pre-existing intelligence, we convey information through text without ever showing our full thought process.
      In other words, we are truly starting to fine-tune it for using text for thinking, not simply generating cute answers to fool humans. This may seem obvious, but I don't think people get this.
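
A note for readers following the Q-learning discussion in this thread: in reinforcement learning, Q(s, a) estimates how good action a is in state s, and tabular Q-learning improves that estimate from experience. Below is a minimal illustrative sketch in Python (a toy setup with made-up names; it is not OpenAI's method, and the environment loop that would call these functions is omitted):

    import random
    from collections import defaultdict

    # Toy tabular Q-learning sketch (illustrative only, not OpenAI's Q*).
    ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1   # learning rate, discount factor, exploration rate
    ACTIONS = [0, 1, 2, 3]                   # e.g. up / down / left / right in a small gridworld

    Q = defaultdict(float)                   # Q[(state, action)] -> estimated long-term value

    def choose_action(state):
        # Epsilon-greedy: usually exploit the best known action, occasionally explore.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def update(state, action, reward, next_state):
        # Q-learning update: move Q(s, a) toward reward + gamma * max_a' Q(s', a').
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

The update looks one step ahead at the best imagined next state, which is the "simulate futures, then pick the best one" intuition the commenters describe.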

  • @pedxing
    @pedxing 5 หลายเดือนก่อน +98

    THIS was the technical dive I've wanted to find for the last few days. Thank you so much for taking the time to dig into the development of these papers and the technologies they represent.

    • @Reece-hf1zx
      @Reece-hf1zx 5 หลายเดือนก่อน

      saaaaaahhj

  • @bobtivnan
    @bobtivnan 5 หลายเดือนก่อน +80

    Wow. Very impressive investigative journalism. No other AI channel does their homework better than you. Well done sir.

    • @IstvanNagy86
      @IstvanNagy86 5 หลายเดือนก่อน +6

      Even if this whole thing turns out to be a red herring, I really appreciate that he distills so much information into a single video.

  • @Peteismi
    @Peteismi 5 หลายเดือนก่อน +85

    The idea of Q* as an optimizing search through the action space sounds quite plausible, just like the A* algorithm, which is a generic optimal pathfinding algorithm. (A minimal A* sketch follows this thread.)

    • @adfaklsdjf
      @adfaklsdjf 5 หลายเดือนก่อน +7

      ohhh that Q* / A* link is very interesting!

    • @productjoe4069
      @productjoe4069 5 หลายเดือนก่อน +10

      This was my thought too. Possibly using edits of the step-by-step reasoning as the edges, or some more abstract model. You could then weight the edges by using a verifier that only needs to see a bounded context (the original, the edited, and the prompt) to say whether or not the edit is of high quality. It’s sort of like graph-of-thought, but more efficient.

    • @ZeroUm_
      @ZeroUm_ 5 หลายเดือนก่อน +10

      A* was my first thought as well, it's such a famous, CompSci graduate level algorithm.
      (Sagittarius A* is also the name of the Milky Way's central supermassive black hole)

    • @mawungeteye657
      @mawungeteye657 5 หลายเดือนก่อน

      Even if it's just speculative it's a decent idea for an actual study. Wish someone would test it.

    • @sensorlock
      @sensorlock 5 หลายเดือนก่อน +3

      I was thinking something along this line too. Is there a way to prune chains of thought, like A* prunes minimax?
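
For readers unfamiliar with the A* algorithm this thread keeps comparing Q* to, here is a minimal sketch (illustrative only; any connection to OpenAI's Q* is speculation). It always expands the node with the lowest estimated total cost g + h, where g is the cost so far and h is a heuristic estimate of the remaining cost to the goal:

    import heapq

    def a_star(graph, h, start, goal):
        # graph: dict mapping node -> list of (neighbor, edge_cost); h: heuristic estimate to the goal.
        frontier = [(h(start), 0, start, [start])]       # entries are (f = g + h, g, node, path)
        best_g = {start: 0}
        while frontier:
            f, g, node, path = heapq.heappop(frontier)   # pop the most promising node
            if node == goal:
                return path, g
            for neighbor, cost in graph.get(node, []):
                new_g = g + cost
                if new_g < best_g.get(neighbor, float("inf")):
                    best_g[neighbor] = new_g
                    heapq.heappush(frontier, (new_g + h(neighbor), new_g, neighbor, path + [neighbor]))
        return None, float("inf")

For example, with graph = {"A": [("B", 1)], "B": [("C", 1)]} and h = lambda n: 0, a_star(graph, h, "A", "C") returns (["A", "B", "C"], 2).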

  • @Madlintelf
    @Madlintelf 5 หลายเดือนก่อน +118

    We all spent the last week watching the soap opera drama and listening to wild ideas, and nobody put it all together in a nice package with a bow on it until you posted this video. It is a theory, but one that is well thought out, has references, and seems extremely logical. Thanks for putting so much work into this; it's not falling on deaf ears, we truly appreciate you. Thanks, Bill Borgeson

    • @lollerwaffleable
      @lollerwaffleable 5 หลายเดือนก่อน

      Who is listening? Remember I just want like a fucking job. From OpenAI specifically.

    • @lollerwaffleable
      @lollerwaffleable 5 หลายเดือนก่อน

      When do we announce that I’m the new ceo of open ai

    • @lollerwaffleable
      @lollerwaffleable 5 หลายเดือนก่อน

      Lmao

  • @caiorondon
    @caiorondon 5 หลายเดือนก่อน +50

    This channel outpaces in quality ANY other channel on AI news on YouTube. The way you try your best to keep the hype out and reduce the amount of speculation is really something to be proud of and really what makes your content so different from other creators.
    Yours, sir, is the only channel on the topic where I am happy to watch (and like) every video. ❤
    Cheers from Brazil!

  • @gmmgmmg
    @gmmgmmg 5 หลายเดือนก่อน +22

    The New York Times or another major newspaper should hire you, seriously. The amount and quality of research and the way you explain and convey AI news and information are truly remarkable. You are currently my favourite YouTube channel.

  • @nescirian
    @nescirian 5 หลายเดือนก่อน +14

    At 17:20 Lukasz Kaiser says multi-modal chain of thought would be basically a simulation of the world. Unpacking this, you can think of our own imaginations as essentially a multi-modal “next experience predictor”, which we run forwards as part of planning future actions. We imagine a series of experiences, evaluate the desirability of those experiences, and then make choices to select the path to the desired outcome. This description of human planning sounds a lot like Q-learning - modeling the future experience space as a graph of nodes, where the nodes are experiences and the edges are choices, then evaluating paths through that space based on expected reward. An A* algorithm could also be used to navigate the space of experiences and choices, possibly giving rise to the name Q*, but it's been many years since I formally studied abstract pathfinding as a planning method for AI, and as far as I can tell from googling just now over my morning coffee, it seems like the A* algorithm would not be an improvement over the Markov decision process traditionally used to map the state space underlying Q-learning.
    My extrapolation gets a bit muddy at that point, but maybe there's something there. To me, a method that allows AI to choose a path to a preferred future experience would seem a valuable next step in AI development, and a possible match for both the name Q* and the thoughts of a researcher involved with it.
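
For context on the notation behind the name: in reinforcement learning, Q* conventionally denotes the optimal action-value function, defined by the Bellman optimality equation (standard textbook material, independent of whatever OpenAI's Q* actually is):

    Q^{*}(s, a) = \mathbb{E}\left[\, r(s, a) + \gamma \max_{a'} Q^{*}(s', a') \,\right]

Here s' is the next state reached after taking action a in state s, r is the immediate reward, and gamma is the discount factor; acting greedily with respect to Q* yields an optimal policy, which is what the asterisk signals.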

  • @grimaffiliations3671
    @grimaffiliations3671 5 หลายเดือนก่อน +45

    This really is the best AI channel around, we're lucky to have you

  • @a.s8897
    @a.s8897 5 หลายเดือนก่อน +14

    You are my first source for AI news; you go deep into the details and do not cut corners, like a true teacher.

  • @rcnhsuailsnyfiue2
    @rcnhsuailsnyfiue2 5 หลายเดือนก่อน +13

    18:49 I believe Q* is a reference to the “A* search algorithm” in graph theory. Machine learning is fundamentally described by graph theory, and an algorithm like A* (which searches a graph for an optimal path as efficiently as possible) would make total sense.

    • @bl2575
      @bl2575 5 หลายเดือนก่อน

      It was also my thought when I heard the algorithm name. It is basically a cost minimization algorithm to reach a target node. The difficult part in this context is figuring out what heuristic to use to evaluate whether a step of reasoning is closer to answering the question than another one. Maybe that's where the Q-learning policy plays a role.

  • @tai222
    @tai222 5 หลายเดือนก่อน +18

    This channel and Dave Shapiro are my go to for AI news!

    • @Veileihi
      @Veileihi 5 หลายเดือนก่อน +4

      Lmao, I left the same comment on one of Dave's videos but in reverse

    • @MarkosMiller15
      @MarkosMiller15 5 หลายเดือนก่อน +2

      I'd add Wes too, whom I discovered recently, but yeah, those two really are the main trustworthy, non-cryptobro-vibes channels

    • @krishp1104
      @krishp1104 5 หลายเดือนก่อน +5

      I just found Dave Shapiro today but I think he's wayyy too impulsive to sound the AGI alarm

  • @stcredzero
    @stcredzero 5 หลายเดือนก่อน +78

    This makes me want to produce a generative AI comic called "The Verifier". It would be about a verifier AGI fighting a David-versus-Goliath guerilla war against a malevolent superoptimizer, using its ability to poke holes in the answers of a much larger model to save humanity. EDIT: The tactic of doing lots of iterations, then rewarding on the raw probability of winning - this smells a lot like evolution by natural selection. It's a brutally simple emergent fitness function!

    • @lollerwaffleable
      @lollerwaffleable 5 หลายเดือนก่อน +5

      Heartwarming, thanks whoever you are

    • @omaviquadir
      @omaviquadir 5 หลายเดือนก่อน

      Read blame! for inspo.

    • @KalebPeters99
      @KalebPeters99 5 หลายเดือนก่อน +1

      Wow, amazing... 💕
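
The verifier idea in the thread above - a smaller model judging the answers of a larger generator - is essentially best-of-N sampling with a reward model, as used in the "Let's Verify Step by Step" line of work linked in the description. A minimal sketch (the generate and verify callables are hypothetical stand-ins, not a real API):

    def best_of_n(generate, verify, question, n=16):
        # Draw n independent candidate answers, then keep the one the verifier scores highest.
        candidates = [generate(question) for _ in range(n)]
        return max(candidates, key=lambda answer: verify(question, answer))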

  • @xXWillyxWonkaXx
    @xXWillyxWonkaXx 5 หลายเดือนก่อน +4

    By far one of the most informative and condensed videos about the essential concepts/building blocks towards creating AGI. Very succinct, great tempo. 👏🏼

  • @rioiart
    @rioiart 5 หลายเดือนก่อน +20

    Hands down the best YouTube channel for AI news.

  • @DavidsKanal
    @DavidsKanal 5 หลายเดือนก่อน +5

    "You need to give the model the ability to think longer than it has layers" is what really sticks with me, it's such an obvious next step for LLMs which currently run in constant time. Let's see where this leads!

  • @garrettmyles6493
    @garrettmyles6493 5 หลายเดือนก่อน +8

    As someone outside the industry, this is such a great resource. Thank you very much for the hard work and keeping us in the loop! I've been waiting for this video since the Reuters article

  • @apester2
    @apester2 5 หลายเดือนก่อน +16

    I was in two minds about whether to take the Q* thing seriously until you posted about it. Now I accept that it is at least not just sensational hype. Thanks for keeping us up to date!

  • @MasonPayne
    @MasonPayne 5 หลายเดือนก่อน +7

    A* is an algorithm mainly used in pathfinding, which works very similarly to what you described as Q. Imagine the idea landscape as a set of information you need to search through to find a path to the answer. That is what I think they mean by Q*.

  • @etunimenisukunimeni1302
    @etunimenisukunimeni1302 5 หลายเดือนก่อน +46

    Amazing work. Thanks for, ahem, pushing back the veil of ignorance 😁
    So refreshing to get an informed and non-sensational take on this latest OpenAI X-Files case. It doesn't even matter if your educated guess ends up missing the mark. It's this kind of detective work that is sorely needed in any case, at least before we get some official and/or trustworthy info on this James Bond style "great achievement" called Q*

  • @colin2utube
    @colin2utube 5 หลายเดือนก่อน +17

    Game developers will be familiar with the "A*" algorithm, used to find optimal shortest paths between two points on a grid containing obstacles (e.g. a path between the player's location and some target, or between an AI opponent's position and the player's position). I wonder if Q* is some similar shortest-path-finding algorithm between two more abstract nodes in an AI network problem containing some kind of obstruction that has to be navigated around?

    • @johntiede2428
      @johntiede2428 5 หลายเดือนก่อน +2

      I'd add that mazes can be decomposed into trees, and A* is applied to that. Think Trees of Thought not just Chain of Thought, and applying an A*-like algorithm.

  • @ShadyRonin
    @ShadyRonin 5 หลายเดือนก่อน +2

    Love the longer video format! Amazing as usual

  • @zandrrlife
    @zandrrlife 5 หลายเดือนก่อน +9

    I would say he's actually understating the dramatic impact CoT has on multi-modal output. Also, things get wacky when you combine vertical CoT iteratively reflecting on horizontal CoT outputs (the actual outputted tokens). Increasing model inner monologue (computation width) across layers is definitely the wave.
    Again, this is why I think synthetic/hybrid data curation costs will soon match model pretraining. Even if you're perturbing existing data, you can lift its salient density to better fit this framework. It's also why I keep saying local models are the way and why I've been obsessed with increasing representational capacity in smaller models.

  • @KP-sg9fm
    @KP-sg9fm 5 หลายเดือนก่อน +29

    Would love to see you do interviews with lesser known but key figures in the industry, you would have such good questions.

  • @adfaklsdjf
    @adfaklsdjf 5 หลายเดือนก่อน +9

    as always, _whatever happens_ , thank you for your work

  • @sgstair
    @sgstair 5 หลายเดือนก่อน +7

    Here's the idea that I had:
    Let's say you think of the output of a "Let's verify step by step" prompt as a tree of possible responses. Each step has a wide variety of possible subsequent steps.
    Then let's say you have a classifier network that scores how good chains of responses are relative to one another.
    Then you could run an A* search algorithm over the tree of possible response chains efficiently, only following the most useful ones, and explore an unimaginably huge search space without that much compute.
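
A rough sketch of the idea in the comment above: a best-first search over chains of reasoning steps, where a learned verifier scores partial chains and the search always expands the most promising one next. The propose_next_steps, verifier_score and is_complete callables are hypothetical stand-ins (for example, an LLM sampling candidate next steps and a reward model scoring them), not real APIs:

    import heapq

    def search_reasoning_chains(question, propose_next_steps, verifier_score, is_complete,
                                branch_factor=4, max_expansions=100):
        # Frontier of (negative score, chain); heapq pops the highest-scoring chain first.
        frontier = [(-verifier_score(question, []), [])]
        for _ in range(max_expansions):
            if not frontier:
                break
            neg_score, chain = heapq.heappop(frontier)
            if is_complete(chain):
                return chain                              # best complete chain found so far
            for step in propose_next_steps(question, chain)[:branch_factor]:
                new_chain = chain + [step]
                heapq.heappush(frontier, (-verifier_score(question, new_chain), new_chain))
        return None

Because only the highest-scoring chains are extended, most of the enormous space of possible response chains is never touched, which is the "huge search space without that much compute" point above.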

  • @agenticmark
    @agenticmark 5 หลายเดือนก่อน +3

    This is the basis of a Monte Carlo search, or even a genetic algorithm: you are simulating many worlds and selecting the world that best fits the needed model. By the way, this is great work - the research you did, the papers you referenced, and the video in general! Love it.

  • @ddwarful
    @ddwarful 5 หลายเดือนก่อน +8

    Q* found the fabled NSA AES backdoor.

  • @Neomadra
    @Neomadra 5 หลายเดือนก่อน +7

    It's just incredible how you connect all these dots in such a short amount of time. Even if Q* turns out to be a mirage, at least I learned something about promising research directions :)

    • @xXWillyxWonkaXx
      @xXWillyxWonkaXx 5 หลายเดือนก่อน +2

      If I'm understanding this correctly, it's: test-time computation, chain of thought (CoT), Let's Verify Step by Step, and self-taught reasoning (STaR).

  • @zaid6527
    @zaid6527 5 หลายเดือนก่อน +1

    Just came across your AI channel; I found it to be one of the best AI channels on YouTube that you can find, and I also like the intuition part where you talked about Let's Verify. Amazing video, keep up the good work 👍

  • @jimg8296
    @jimg8296 5 หลายเดือนก่อน +1

    Great research, thank you very much. I appreciate how you have pulled together a vast amount of data into an understandable video. It would take me months to get close to this understanding of Q*. Now it was just half an hour with your research and video editing. RESPECT!

  • @FranXiT
    @FranXiT 5 หลายเดือนก่อน +5

    I was just thinking about how much I wanted a new video from you :3 thank you.

  • @TheLegendaryHacker
    @TheLegendaryHacker 5 หลายเดือนก่อน +5

    Damn, to me this feels like the discovery of nuclear chain reactions. It's not quite there yet, but you can see the faint glimmer of something world changing to come. Especially that "general self-improvement" stuff... GPT-5 is gonna be wild.

  • @Rawi888
    @Rawi888 5 หลายเดือนก่อน +1

    I'm lying here depressed beyond all reasoning, and hearing you speak about your passions really lifts my spirits. Thank you friend.

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +1

      Thanks Rawi, that's so kind. Now time for you to find and speak on your passions!

    • @Rawi888
      @Rawi888 5 หลายเดือนก่อน

      @@aiexplained-official GOTCHA 🫡. You just joined twitter, imma find you and make you proud.

  • @MrSchweppes
    @MrSchweppes 5 หลายเดือนก่อน +2

    Oh, I've been waiting for your video since the Q Star news. Great dive. Thanks a lot for making this video! 👍

  • @spaceadv6060
    @spaceadv6060 5 หลายเดือนก่อน +22

    Still the highest quality AI channel on YouTube. Thanks again!

  • @darinkishore9606
    @darinkishore9606 5 หลายเดือนก่อน +5

    you’re goated for this one man

  • @xwkya
    @xwkya 5 หลายเดือนก่อน +2

    This channel is a blessing. I have been navigating news the past week, but this is the place that I feel gives the most accurate information and informed speculations.

    • @xwkya
      @xwkya 5 หลายเดือนก่อน

      And the theory of Q* applying Q-learning to decoding is very interesting. Thinking of GPT Zero, I have wondered if algorithms used in AlphaZero such as MCTS (using GPT as a policy function) have been tested for decoding purposes; this also fits the idea of increasing inference cost. I hope you will continue to share your knowledge and investigations.
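
A compact sketch of the Monte Carlo tree search idea mentioned above, in the AlphaZero spirit of combining a policy with lookahead (whether anything like this is used for LLM decoding is pure speculation; legal_moves, apply_move and rollout_value are hypothetical stand-ins for an environment or a decoder plus a value model):

    import math, random

    class Node:
        def __init__(self, state, parent=None):
            self.state, self.parent = state, parent
            self.children, self.visits, self.value = {}, 0, 0.0

    def ucb(child, parent_visits, c=1.4):
        # Upper-confidence bound balances exploiting good moves and exploring rarely tried ones.
        if child.visits == 0:
            return float("inf")
        return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

    def mcts(root_state, legal_moves, apply_move, rollout_value, iterations=200):
        root = Node(root_state)
        for _ in range(iterations):
            node = root
            # 1. Selection: walk down fully expanded nodes via UCB.
            while node.children and len(node.children) == len(legal_moves(node.state)):
                node = max(node.children.values(), key=lambda ch: ucb(ch, node.visits))
            # 2. Expansion: add one move not yet in the tree.
            untried = [m for m in legal_moves(node.state) if m not in node.children]
            if untried:
                move = random.choice(untried)
                node.children[move] = Node(apply_move(node.state, move), parent=node)
                node = node.children[move]
            # 3. Simulation: estimate how good the new state is.
            value = rollout_value(node.state)
            # 4. Backpropagation: push the estimate back up to the root.
            while node is not None:
                node.visits += 1
                node.value += value
                node = node.parent
        # The move explored most from the root is the recommendation.
        return max(root.children, key=lambda m: root.children[m].visits)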

  • @tlskillman
    @tlskillman 5 หลายเดือนก่อน +6

    Great job. A real service to us all. Thank you.

  • @Lvxurie
    @Lvxurie 5 หลายเดือนก่อน +9

    Listening to the guy talk about AlphaGo reminds me of how human development occurs.
    An early stage of learning is the actor stage, where kids copy what the people around them do to try and figure out the correct way to act, often also copying poor behaviours.
    The next stage is called the motivated agent. To be an agent is to act with direction and purpose, to move forward into the future in pursuit of self-chosen and valued goals.
    Since AI is essentially trying to recreate human thinking, I wonder if creating AI models that follow the development of humans is the best way to get to AGI.

    • @honkhonk8009
      @honkhonk8009 5 หลายเดือนก่อน

      Lol, I'll apply that to my math courses.
      I'm having trouble with proofs. Right now all I can do is just copy what other people have written and regurgitate it.
      But hopefully with enough practice I can get into the "motivated agent" phase like you suggest, I guess, lmfao.

  • @kombinatsiya6000
    @kombinatsiya6000 5 หลายเดือนก่อน +2

    This is the channel i return to over and over again to make sense of the latest AI research.

  • @holographicman
    @holographicman 5 หลายเดือนก่อน +2

    Hands down the best AI update channel; I just remove any suggested channels popping up at this point. Oh, and as a musician and synth developer, that last demo is cool. I can imagine a synthesizer or DAW in the future where humans can interact in super creative ways. Love it. ❤

  • @6GaliX
    @6GaliX 5 หลายเดือนก่อน +3

    The name Q* might just be an homage to the A* pathfinding method,
    and therefore a special way of creating chains of thought,
    while "Q" = Q-learning, a common reinforcement learning method in machine learning.

  • @Stephen_Lafferty
    @Stephen_Lafferty 5 หลายเดือนก่อน +4

    I can barely believe that it has been just seven days since Sam Altman was fired by OpenAI. What an American Thanksgiving it was for Sam to return to OpenAI. Thank you for your insightful analysis as always!

  • @uraszz
    @uraszz 5 หลายเดือนก่อน +1

    I've been seeing news about Q* for a day or two but refused to watch anything before you uploaded. I trust you with anything AI. Thank you!!

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +1

      I might be wrong, but I gathered quite a bit of evidence for you to evaluate!

  • @Robert_McGarry_Poems
    @Robert_McGarry_Poems 5 หลายเดือนก่อน

    20:00 I think you pretty much nailed it. This sounds pretty amazing. In all honesty this should be how the core models are trained. In my opinion, this type of processing would make alignment super easy. In the sense that you could have multiple "observers" all with their own _obviously programmed in bias_ as a second layer, that then would be filtered by a third layer, which is the true autonomous "observer."

  • @krishp1104
    @krishp1104 5 หลายเดือนก่อน +4

    I've been checking your channel impulsively waiting for this video

  • @middle-agedmacdonald2965
    @middle-agedmacdonald2965 5 หลายเดือนก่อน +4

    Thanks! Very down to earth, and well thought out.

  • @Gonko100
    @Gonko100 5 หลายเดือนก่อน +2

    By far the best channel regarding this topic. Like, it's not even close.

  • @guilleru2365
    @guilleru2365 5 หลายเดือนก่อน +1

    Things are going so fast that it's hard to imagine what it will look like in just a few more weeks. Amazing work!

  • @ryanhm1004
    @ryanhm1004 5 หลายเดือนก่อน +7

    This reminds me of the movie "Arrival", where it was very difficult to communicate with the aliens because you had to explain what an adjective is and what a noun is. It would be easier to communicate with robots through mathematics than language (as Karpathy said), because you could simply reward them by giving them functions to solve, evolving this ability to reason. In the end, as Aristotelian logic theory says, language is mathematics too.

    • @Mr_Duck_RVA
      @Mr_Duck_RVA 5 หลายเดือนก่อน +1

      I just watched that movie for first time the other night

    • @electron6825
      @electron6825 5 หลายเดือนก่อน

      ​@@Mr_Duck_RVAWhat did you think about it?

  • @DaveShap
    @DaveShap 5 หลายเดือนก่อน +5

    This is way better than breaking AES-192.

    • @zero_given
      @zero_given 5 หลายเดือนก่อน +2

      Loved your video mate!

    • @prolamer7
      @prolamer7 5 หลายเดือนก่อน +2

      You are a big person for acknowledging that this video is better than yours!

    • @DaveShap
      @DaveShap 5 หลายเดือนก่อน

      @@prolamer7 we're all speculating here and I have a lot of respect for my fellow creators. I view it as all part of a bigger conversation.

    • @prolamer7
      @prolamer7 5 หลายเดือนก่อน

      @@DaveShap That said!!! Of the many other AI YouTubers, you are consistently among the TOP too!!! I hate to sound too simplistic. Sadly the YT comment system is kinda designed to allow only short thoughts and shouts.

  • @skier340
    @skier340 5 หลายเดือนก่อน

    Fantastic breakdown. You're really doing your homework to get us some real, concrete possibilities of what's actually happening with the architecture of Q* when everything else just seems like wild speculation.

  • @En1Gm4A
    @En1Gm4A 5 หลายเดือนก่อน +1

    Thanks, I can go sleep fine again now - you uncovered the behind-the-scenes. It is even aligned with what I thought might be the key to more capabilities. THANK YOU!!!!

  • @nathanbanks2354
    @nathanbanks2354 5 หลายเดือนก่อน +18

    GPT-4 is already using "let's verify step by step". I've often asked it to program something or refactor something, and the first thing it does is come up with an English list of what it's about to do. This list then becomes part of the tokens it uses to generate the following tokens as it actually writes the program. It's like it changes my query into an easier query. It wasn't doing this when I signed up in April. (A sketch of the prompting pattern follows this thread.)

    • @thearchitect5405
      @thearchitect5405 5 หลายเดือนก่อน +5

      It does it on small scales, but not quite on the same scale as in the paper. Otherwise you'd be getting 30 line responses to basic questions. It also doesn't verify on a step by step basis.

    • @nathanbanks2354
      @nathanbanks2354 5 หลายเดือนก่อน +1

      @@thearchitect5405 I meant that they're using the techniques suggested from some papers earlier this year which suggested to use "think step-by-step" as part of the query to an LLM. It was a prompt-engineering technique. This was one of several techniques which substantially improved accuracy for answering exam questions. It could definitely be improved and I didn't read this particular paper, so I'm sure you're right about the scale being larger.

    • @adfaklsdjf
      @adfaklsdjf 5 หลายเดือนก่อน

      @@nathanbanks2354 have you set any custom instructions, by chance? ;)

    • @homelessrobot
      @homelessrobot 5 หลายเดือนก่อน +1

      @@thearchitect5405 Maybe there is a threshold of complexity or something, but a couple of weeks ago I did an open-book calculus course with GPT-4. It was generating step-by-step answers so large that it would stop and ask me if I wanted it to continue - more than 30 lines each, much greater. These answers took several minutes each to generate in full. It also passed that course with flying colors.
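
The prompting pattern discussed in this thread - asking the model to lay out a plan before answering, so that the plan tokens condition the final answer - can be sketched like this (llm is a hypothetical callable wrapping whatever chat model you use; it is not a real library call):

    def answer_directly(llm, question):
        return llm(f"Question: {question}\nAnswer:")

    def answer_step_by_step(llm, question):
        # First pass: ask for an explicit plan ("let's think step by step").
        plan = llm(f"Question: {question}\nLet's think step by step and write out a short plan first.")
        # Second pass: the plan is now part of the context that conditions the final answer.
        return llm(f"Question: {question}\nPlan:\n{plan}\nUsing the plan above, give the final answer:")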

  • @Y3llowMustang
    @Y3llowMustang 5 หลายเดือนก่อน +4

    I've been refreshing waiting for this video from you

  • @HenriKoppen
    @HenriKoppen 5 หลายเดือนก่อน

    Whenever I have a discussion about any topic and someone makes a claim, I ask, "Please help me understand your conclusion; can you walk me there step by step?" This is so powerful, because when a claim is heavily biased it will emerge from this step-by-step process. It really made me stronger at having discussions and sharing my step-by-step reasoning. All truth comes from the details... This video is really inspiring, smart, in the right tone, and well explained. Thank you for spending the time to do this right!

  • @minnie-piano3969
    @minnie-piano3969 5 หลายเดือนก่อน +2

    Really well done video! I actually was thinking that the star in Q* represents the A* algorithm, which according to the wiki maintains "a tree of paths originating at the start node and extending those paths one edge at a time." This sounds really similar to timestamp 37:37 of Andrej Karpathy's recent intro-to-LLM video, where he talks about a lot of recent interest in System 2 thinking for LLMs as a "tree of thoughts"

  • @QuarkTwain
    @QuarkTwain 5 หลายเดือนก่อน +5

    As if things weren't trending enough towards the conspiratorial, now they have their own "Q". Feel the AGI!

    • @adfaklsdjf
      @adfaklsdjf 5 หลายเดือนก่อน

      I feel it.

  • @KyriosHeptagrammaton
    @KyriosHeptagrammaton 5 หลายเดือนก่อน +4

    I remember back when they had AIs learning to play Mario and it was super slow to get generically good; then they encouraged it to get a high score instead of reaching the end goal, or something like that, and suddenly it was learning way faster and much better at arbitrary levels.

  • @gobl-analienabductedbyhuma5387
    @gobl-analienabductedbyhuma5387 5 หลายเดือนก่อน

    Such deep research! Man, you're just always way ahead of everyone else with your work. Thank you!

  • @WilliamsDarkoh
    @WilliamsDarkoh 5 หลายเดือนก่อน +1

    Congrats on the 200k, my predictions were on point!

  • @jtjames79
    @jtjames79 5 หลายเดือนก่อน +7

    Q* make me a design for a cold fusion powered jetpack, please. 😎👍

  • @nomadv7860
    @nomadv7860 5 หลายเดือนก่อน +3

    Amazing video. I appreciate your investigation into this

  • @sushihusi35
    @sushihusi35 5 หลายเดือนก่อน

    Damn, the research/investigation you just did on this topic is insane. Hats off, thank you for this video!

  • @DiscoTuna
    @DiscoTuna 5 หลายเดือนก่อน

    Wow - what a detailed line of thought and extensive amount of research you have gone through to produce this vid. Thanks

  • @aspuzling
    @aspuzling 5 หลายเดือนก่อน +3

    Thank you for not just spouting the "have OpenAI reached AGI?" hyperbole. This is really interesting research.

  • @jeff__w
    @jeff__w 5 หลายเดือนก่อน +4

    25:45 “I think the development is likely a big step forward for narrow domains like mathematics but is in no way yet a solution for AGI; the world is still a bit too complex for this to work yet.”
    That's a really important qualification - we're not _yet_ on the verge of our glorious/terrifying AGI future - and that, I think, undercuts the (to me, much over-hyped) theory that some AI “breakthrough” was what spooked the board into ousting Sam Altman. Some old-fashioned power play/interpersonal conflict seems a lot more likely to me (although an AI breakthrough might have exacerbated the already-existing tensions).
    And that Q* is a reference to “the optimal Q-function” 18:44 seems entirely plausible. It’s just what you’d expect from the AI researchers at OpenAI.

  • @jontrounson5441
    @jontrounson5441 5 หลายเดือนก่อน

    This is the information I’ve been looking for. Thanks for doing the heavy lifting on the research that we’ve all needed on this topic!

  • @JustinHalford
    @JustinHalford 5 หลายเดือนก่อน +1

    I was waiting for this one! Absolutely riveting. Our collective progress is quickly being rendered compute bound.

    • @JohnSmith762A11B
      @JohnSmith762A11B 5 หลายเดือนก่อน +1

      This is perhaps why Sam has been running around trying to get new chip fabs built. Nvidia is simply not enough when infinite computing power is best. This in fact has always been a primary doom scenario: that an AGI/ASI becomes addicted to getting smarter and reformats the entire cosmos into one gigantic mind.

  • @Chris-se3nc
    @Chris-se3nc 5 หลายเดือนก่อน +4

    Obviously they have developed Q from Star Trek. Q is initially presented as a cosmic force judging humanity to see if it is becoming a threat to the universe, but as the series progresses, his role morphs more into one of a teacher to Picard and the human race generally - albeit often in seemingly destructive or disruptive ways, subject to his own will

  • @user-hk8jt6so3l
    @user-hk8jt6so3l 5 หลายเดือนก่อน +11

    YOU ARE THE BEST! I am so happy to have found you back at the beginning of the AI "craze", and words cannot describe how grateful your other viewers and I are to you for such high quality content! I believe your work will play a huge role in humanity's future!
    edit: grammar

  • @SamGirgenti
    @SamGirgenti 5 หลายเดือนก่อน

    You and Wes are the best AI presenters on youtube in my opinion. Thanks for taking the time to teach. :)

  • @a.thales7641
    @a.thales7641 5 หลายเดือนก่อน +1

    Philip, this video of yours really brings some tweets from Jimmy Apples to mind. They said something akin to this: "the really important papers regarding AI and better models are already out, some for quite a while, but OpenAI tries to divert attention from those important papers to some other, new, but not that important ones".
    At least that is what I seem to remember.

  • @randomuser5237
    @randomuser5237 5 หลายเดือนก่อน +3

    This actually makes me even less hopeful about open-source AI. It's quite clear that most of the people who can make new breakthroughs are working in these companies and will not publish their research. It also throws out the idea that it's only about data and compute to make better models. Open source will keep lagging behind them every day unless the government steps up and provides the financial incentives to the national labs so that they get the top researchers and publish open-source models.

    • @prolamer7
      @prolamer7 5 หลายเดือนก่อน

      You are right, there is really only a handful of really smart people in open source working for "free", unlike in companies where you are paid millions. BUT once there is a model as smart as GPT-4 for everyone to use, it will help even the small guys create novel and good models.

  • @felipoto
    @felipoto 5 หลายเดือนก่อน +3

    New Ai Explained video letss goooooo

  • @michaelwoodby5261
    @michaelwoodby5261 5 หลายเดือนก่อน +1

    20:43 is the most succinct description of Q* I have heard yet.

  • @tristanwegner
    @tristanwegner 5 หลายเดือนก่อน

    I followed closely over the last weekend, but as my time investment is limited, it is great to have you synthesize the facts together with some digging, like the barely watched videos!

  • @supremebeme
    @supremebeme 5 หลายเดือนก่อน +9

    AGI happening sooner than we think?

    • @SaInTDomagos
      @SaInTDomagos 5 หลายเดือนก่อน

      That’s the power of exponential functions.

  • @tomaszkarwik6357
    @tomaszkarwik6357 5 หลายเดือนก่อน +4

    3:44, hey, another Polish note. "Łukasz" is indeed the Polish version of Lucas, but if you want to be 100% correct, the transliteration into British English would be something like "wukash".
    Edit: I forgot to "cite" my sources. I am a native Polish speaker.

  • @jacorachan
    @jacorachan 5 หลายเดือนก่อน

    Great video as usual. Please keep on making them! You provide a thoughtful vision of the current state of AI, and I really appreciate the way you elaborate your ideas with what you read or listen to in videos.
    Again, fantastic work 👏

  • @danielcsillag1726
    @danielcsillag1726 5 หลายเดือนก่อน +1

    Been eagerly waiting for this 😂, great video!

  • @antoniopaulodamiance
    @antoniopaulodamiance 5 หลายเดือนก่อน +4

    Best channel. The amount of time this dude spends reading and following all the noise to get to a high-quality 15-minute video is fantastic.

  • @andrew.nicholson
    @andrew.nicholson 5 หลายเดือนก่อน +7

    20:45 The idea of training a model on its own output makes me think about our own brains and how dreaming and sleep are critical to our ability to learn. Sleep is when we take what has happened during the day - our successes and failures - and integrate them into our long-term memory.

    • @adfaklsdjf
      @adfaklsdjf 5 หลายเดือนก่อน +1

      we also loop over our thoughts while we're thinking about a problem... we come up with an idea and then reconsider it, poke holes, test it out in various ways, compare it to other ideas. a neural net's inputs traverse the network once and become outputs.. without loops it's like it's only given one shot to "think about" something before giving its answer..
      the sleeping/dreaming/integration analogy is interesting.

    • @homelessrobot
      @homelessrobot 5 หลายเดือนก่อน

      Or the concept of active recall: you read a little, then you answer questions about what you learned or summarize what you have learned and receive feedback on that. It's important to note that the output it's training on isn't just raw output; it's not a closed loop. There is a second model "grading" the answers, so there is external feedback involved.
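
The loop described in this thread - a model generating its own rationales, an external check keeping only the ones that reach a correct answer, and the model then being fine-tuned on that filtered output - is roughly the STaR recipe linked in the description. A minimal sketch (all helper names here are hypothetical stand-ins, not the paper's code):

    def self_training_round(model, labelled_problems, generate_rationale, fine_tune):
        kept = []
        for problem, gold_answer in labelled_problems:
            rationale, answer = generate_rationale(model, problem)   # model's own chain of thought
            if answer == gold_answer:                                # external feedback, not a closed loop
                kept.append((problem, rationale, answer))
        return fine_tune(model, kept)                                # train on the vetted self-generated data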

  • @williamjmccartan8879
    @williamjmccartan8879 5 หลายเดือนก่อน

    Thank you Philip, glad to see you've taken the dive on X. Thank you again; teaching these lessons is really important to a lot of us who don't have your skills and experience in researching all of this material and are educated through that process. Peace

  • @beowulf2772
    @beowulf2772 5 หลายเดือนก่อน +3

    A 6B model with that much capability 💀

  • @nanow1990
    @nanow1990 5 หลายเดือนก่อน +4

    Let's break this down step by step.

  • @errgo2713
    @errgo2713 5 หลายเดือนก่อน

    This is such a helpful round up. Thank you!!

  • @MrBorndd
    @MrBorndd 5 หลายเดือนก่อน +1

    This channel provides the best, most well-researched, cutting-edge information about AI development available, while the competition just repeats each other and doesn't offer much more than what we already got from Reuters. Excellent journalism!

  • @memegazer
    @memegazer 5 หลายเดือนก่อน +3

    Thanks!

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +2

      Thanks memegazer!!

    • @memegazer
      @memegazer 5 หลายเดือนก่อน +1

      Really glad you dug into this to offer some new insight...it is really fascinating

  • @ZuckFukerberg
    @ZuckFukerberg 5 หลายเดือนก่อน +1

    This must be one of your hardest videos to tackle yet; it will definitely require a few sittings to fully understand.
    Thanks for not being afraid of providing us with some complex information, even if perfectly simplified by you.

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +1

      Thanks man, as I commented, wish I could have included more captions and explainers but ran out of time before flight

    • @ZuckFukerberg
      @ZuckFukerberg 5 หลายเดือนก่อน

      @@aiexplained-official Such is life, keep it up, AI Explained

  • @jumpstar9000
    @jumpstar9000 5 หลายเดือนก่อน +4

    Maybe it is a search for Quality/model refinement of the weights using something similar to the A* algorithm. Pure speculation of course.
    Very interesting stuff. Thanks for the insights and commentary Philip.

    • @aiexplained-official
      @aiexplained-official  5 หลายเดือนก่อน +1

      Thanks Jumpstar!

    • @jumpstar9000
      @jumpstar9000 5 หลายเดือนก่อน

      @@aiexplained-official I was thinking. Maybe it's a goal seeking strategy that's better than simple CoT. That would make a lot of sense.

    • @jumpstar9000
      @jumpstar9000 5 หลายเดือนก่อน

      @@aiexplained-official You know what else I was thinking: if Ilya is running the superalignment team, and "super" means superintelligence, doesn't that kind of imply that AGI is already done if they are on to superintelligence? Unless they are just trying to get ahead of the game a bit, of course. But it is difficult to guess what an ASI would even be like.

  • @geepytee
    @geepytee 5 หลายเดือนก่อน +1

    Excellent video, and glad you're on twitter now

  • @Rawi888
    @Rawi888 5 หลายเดือนก่อน

    You, Matt Wolfe, Wes and David Shapiro are my only trusted sources. Especially you. I reaaaaally love and appreciate all the work you do.

  • @TheRealistMus
    @TheRealistMus 5 หลายเดือนก่อน +5

    First. Always a pleasure when AI explained uploads

    • @jerkevandenbraak
      @jerkevandenbraak 5 หลายเดือนก่อน +1

      Best AI channel there is (or that I know of)