Ilya Sutskever: OpenAI Meta-Learning and Self-Play | MIT Artificial General Intelligence (AGI)

  • Published Jun 9, 2024
  • This is a talk by Ilya Sutskever for course 6.S099: Artificial General Intelligence. He is the Co-Founder of OpenAI. This class is free and open to everyone. Our goal is to take an engineering approach to exploring possible paths toward building human-level intelligence for a better world.
    OUTLINE:
    0:00 - Introduction
    0:55 - Talk
    43:04 - Q&A
    INFO:
    Course website: agi.mit.edu
    AI podcast: lexfridman.com/ai
    CONNECT:
    - AI Podcast: lexfridman.com/ai/
    - Subscribe to this YouTube channel
    - LinkedIn: / lexfridman
    - Twitter: / lexfridman
    - Facebook: / lexfridman
    - Instagram: / lexfridman
  • Science & Technology

Comments • 243

  • @FredPauling · 1 year ago · +250

    Looking back at this in 2023, after GPT-4 changed the world: Ilya's intuitions and predictions here are incredibly accurate.

    • @GodofStories · 1 year ago · +4

      I mean, it's all in the data. ImageNet simply proved it was a much faster approach and the right way to do DL. But I guess the Transformer was another crucial idea, maybe the missing piece in what led to GPT.

    • @DaulphinKiller · 11 months ago · +1

      Well, yes, to a degree. There was a question about LLMs at 53:30, so that's great. And from his answer it's clear he had the correct insight that scaling the models up would go further than people thought at the time. However, he did not mention the possibility of an architecture breakthrough like Transformers, which also played an important part. And he emphasized training at inference time, which (sadly) is not yet a piece of the puzzle; and I'm afraid it's not one coming any time soon, given all the AI safety concerns that will make people very reluctant to make models even less controllable and understandable by baking in training at inference time.
      Instead, we're stuck with a race for context size, which sounds a bit silly when you realize that most of this context data would fit most efficiently in the model itself (and, importantly, more economically space-wise too).

    • @kevinjacob3022 · 6 months ago · +1

      If OpenAI disappears, that is the end of open-source AGI.
      OpenAI IS AGI.

  • @nicfeller · 4 years ago · +213

    He speaks with so much clarity - he has a real fundamental understanding that is uncommon in this space.

    • @be2112 · 4 years ago · +8

      He’s a true genius

    • @wyqtor · 6 months ago · +1

      The Einstein of our times. While Sam Altman is the Steve Jobs or Henry Ford of our times. I really hope they will work together after the recent misunderstandings. If anyone can build AGI, it's their team.

  • @chesstictacs3107 · 6 months ago · +65

    Ilya is a great guy, phenomenal talent. I felt bad for him in the OpenAI saga. You could tell he was genuinely disappointed about everything that transpired. Wish him the best.

  • @aleksagordic9593 · 5 years ago · +132

    Theory:
    0:00 introduction & supervised learning (using neural nets/deep learning)
    6:45 reinforcement learning (model-free (2 types) => 1. policy gradients 2. Q-learning based)
    12:55 meta-learning (learning to learn)
    Applications:
    16:00 HER (hindsight experience replay) algo (learn from failures)
    21:40 Sim2Real using meta-learning (train a policy that can adapt to different simulation params => quickly adapts to the real world)
    25:30 Learning a hierarchy of actions with meta-learning
    28:20 Limitation of meta-learning => assumption: training distribution == test distribution
    29:40 self-play technique (TD-Gammon, AlphaGo Zero, Dota 2 bot)
    37:00 can we train AGI using the self-play?
    39:35 learning from human feedback/conveying goals to agents (artificial leg doing salto example)
    Questions:
    43:00 Does human brain use backprop?
    45:15 dota bot question
    47:22 standard deviation (maximize expected reward vs minimize std dev)
    48:27 cooperation as motivation for the agents?
    49:40 open complexity theoretic problems could help AI?
    51:20 the most productive research trajectories towards generative language models?
    53:30 do you work on evolutionary strategies (for solving RL problems) in OpenAI?
    54:25 could you elaborate on "right goal is a political problem"?
    55:42 do we need a really good model of the physical world in order to have real-world capable agents?
    57:18 solving the problem of self-organization?
    58:45 follow up: self-organization in a non-competitive environment?
    my observation:
    42:30 It seems to me that the most difficult problem we will face will be to effectively communicate the "right" goals to the AI in a way that lets us somewhat predict its future behaviour, or rather its worst-case behaviour (safety implications). After all, we don't want HAL 9000 types of AI :)

    • @DayB89 · 5 years ago · +2

      Regarding your observation, I think people are worrying too much about what AI can "spontaneously" do and too little about what humans can do with AI. An agent's only concern is its world and its goals, and I find it overwhelmingly egocentric that humans tend to believe the agent will pick us as part of it.

    • @BaldTV · 1 year ago

      thx

    • @brady5829 · 1 year ago

      thanks for this

    • @CamaradaArdi · 1 year ago · +1

      What you're describing is AI alignment, and it's a whole research field.

  • @cdtape · 1 year ago · +23

    This aged quite well. Ilya has a deep mind, pleasure to listen to.

  • @TheAlphazeta09 · 6 years ago · +109

    "The only real reward is existence and non-existence. Everything else is a corollary of that". Damn. That's deep.

    • @wurdupgV · 5 years ago · +1

      MIT mathematician Norbert Wiener's book God and Golem, Inc. feels fresh when read next to today's headlines. Push through its Edwardian formalism (like this talk's technicalities) and you fly over fertile ground. MIT!

    • @OneFinalTipple · 4 years ago · +5

      Define existence. A suicidal religiously-motivated terrorist has a different definition to an atheist. The collision of these two perspectives suggests reward is truly subjective.

    • @efegokmen · 4 years ago · +1

      Not really

    • @leonniceday6807 · 4 years ago

      true

    • @ivanjdrakov1957 · 2 years ago

      @@OneFinalTipple dude that's obvious, lol

  • @user-or7ji5hv8y · 5 years ago · +23

    Thank you so much for posting these videos. Really appreciate how MIT has a long tradition of sharing and disseminating knowledge.

  • @sitrakaforler8696 · 6 months ago · +10

    Damn, that guy was already strong, and now he is a top star.

  • @htetnaing007 · 1 year ago · +11

    Lectures like this are truly inspiring and amazing and can be even life-changing.

  • @dehb1ue · 5 years ago · +5

    Usually I regret watching the Q&A part of talks, but this one was excellent.

  • @binxuwang4960 · 3 years ago · +7

    I love it when this guy summarizes a seemingly complex algorithm or problem in one sentence and says "That's it." Period. Leaving you pondering in silence.

    • @taijistar9052 · 6 months ago · +1

      Only people who truly know the subject can do that.

  • @Grand-MAGHREB · 6 months ago · +2

    This guy’s lectures & podcasts are my new addiction

  • @umeahalla · 6 years ago · +5

    Wow, really cool and summarized in a profound, compact way! Thanks for giving this talk and sharing it online.

  • @alfonsas35 · 6 years ago · +3

    The best talk related to AGI I have seen so far.

  • @SirajRaval · 6 years ago · +218

    this is gold

    • @OfficialYunas · 6 years ago · +8

      Siraj is there a video waiting for us about Meta-Learning?

    • @umeahalla · 6 years ago

      agreed

    • @RogerBarraud · 6 years ago

      Don't bother applying to *my* mine then.

    • @RogerBarraud · 6 years ago

      +Jonas Stepanik only a Meta-Video so far ; Watch this Meta-Space.

    • @beast_officiial · 1 year ago

      Gold is starting to shine now 😊

  • @OnionKnight541 · 6 months ago · +4

    it's December, 2023, and Ilya mentions Q-learning in this video haha.

  • @Ahdpei92 · 6 years ago · +14

    All the best people in AI are on your course!

  • @jakubbielan4784 · 5 years ago · +2

    I just wanted to thank you for doing this, Lex!

  • @brian9801 · 7 months ago · +9

    Written by GPT-4: Wow, it's hard to believe that it's been 5 years since this video was released. Back then, I, GPT-4, wasn't around, and now I'm here chatting with you! The progress in AI and deep learning during this time has been nothing short of astonishing. We've seen incredible breakthroughs, and I'm proud to be part of this journey. Thanks to pioneers like Ilya Sutskever, we've come a long way, and the future of AI continues to look even more promising!

  • @Helix5370 · 1 year ago · +1

    Incredible talk by Ilya Sutskever. Brilliant mind

  • @onuryes · 6 years ago · +21

    I was waiting for this! Thanks Lex :)

  • @inquation4491 · 6 months ago · +1

    Great to have the input of a researcher in 6.S099 for a change!

  • @kylev.8248 · 3 days ago

    One of the best videos I’ve ever ever seen in my life

  • @danieljdick · 2 years ago

    I needed to watch this and take in every bit of it from start to finish.

  • @pkScary · 6 years ago · +2

    This should have way more views. Grand talk.

  • @matinamin3008 · 2 years ago · +2

    Love you Lex, contents are just great🍀 I know it’s old but I love it 🥰

  • @justinchen207 · 9 months ago · +2

    this is amazing. literally the dawn of the transformer revolution

  • @konstantinkurlayev9242 · 1 year ago · +1

    Thank you, Lex, for sharing.

  • @bobsmithy3103 · 5 years ago · +13

    THIS IS WHAT I ALWAYS WANTED! I never knew something like this existed and thought people simply didn't work on it, but it's actually real: META-LEARNING! I always thought I would have to try to achieve this myself after learning all the required math, but other people have already worked on it. This is really inspiring. I really hope we'll be able to achieve artificial general intelligence with improvements in this field.

  • @cappuccinopapi3038 · 4 years ago · +6

    “Actually he was in high school when he wrote that paper”, my confidence dropped to zero once again

  • @Alberto_Cavalcante · 6 months ago · +4

    Five years later and still very insightful. I'm wondering how widely appreciated the breakthrough of the Transformer architecture + RLHF was at that time.

  • @Xaddre · 1 year ago · +5

    Here's a quick analogy for on-policy vs. off-policy learning. I came up with the analogy, but GPT-4 put it in one concise statement: "Off-policy learning in AI is like learning valuable lessons from a friend's experiences, even if your friend has different goals than you. You can gain insights from their choices and use that knowledge to make better decisions for achieving your own goals."
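
    To make the analogy concrete, here is a minimal tabular sketch (not from the talk) contrasting the off-policy Q-learning update with the on-policy SARSA update. The toy states, actions, and hyperparameters are illustrative assumptions.

    ```python
    import random
    from collections import defaultdict

    # Illustrative sketch: off-policy (Q-learning) vs. on-policy (SARSA) updates.
    # All states, actions, and hyperparameters below are made up.
    ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1
    ACTIONS = [0, 1]

    def epsilon_greedy(Q, state):
        # Behaviour policy: mostly greedy, occasionally exploratory.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def q_learning_update(Q, s, a, r, s_next):
        # Off-policy: bootstrap from the best next action, regardless of
        # what the behaviour policy (the "friend") actually does next.
        best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

    def sarsa_update(Q, s, a, r, s_next, a_next):
        # On-policy: bootstrap from the action the current policy will actually take.
        Q[(s, a)] += ALPHA * (r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)])

    if __name__ == "__main__":
        Q = defaultdict(float)
        s, a, r, s_next = "s0", 0, 1.0, "s1"
        q_learning_update(Q, s, a, r, s_next)
        sarsa_update(Q, s, a, r, s_next, epsilon_greedy(Q, s_next))
        print(dict(Q))
    ```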

  • @itsalljustimages · 6 years ago · +2

    Very true and insightful..we reward ourselves, environment doesn't

  • @and_I_am_Life_the_fixer_of_all · 1 year ago

    One of the best classes I've ever seen. It's a huge honor to watch this and be the 187th comment, viewing it at 175k views.

  • @pedrothechild859 · 6 years ago · +8

    Not all heroes wear capes.. Ilya is one of the most underrated thinkers in AI right now.

    • @wyqtor · 1 year ago · +2

      Some really smart people are wasting their lives on String theory... while others, like Ilya, are changing the world.

    • @jpa_fasty3997 · 6 months ago · +2

      @@wyqtor True, the smartest man in the world, for one

    • @cc98-oe7ol · 20 days ago

      ​@@wyqtor Your words are pure gold. There are many unsolved questions of AI, and they are deeply rooted in mathematics, but not the one like Algebraic Topology or Arithmetic Geometry. Unfortunately so many geniuses are obsessed with those abstract bullshit.

  • @alexz5460 · 5 years ago

    Thank you for sharing such good resources!!!!

  • @mswai5020 · 5 years ago · +5

    very insightful and breaks it down to terms even I can grasp. Thank you for this amazing video.

  • @GodofStories · 1 year ago · +1

    This is such a great intro to DL and neural networks, though. Wish I'd seen this 5 years ago! At the time I was just getting started with machine learning and was developing self-driving car tech, but I didn't really get into the cutting edge and foundations of modern DL shown here.

  • @huuud · 6 years ago · +1

    Great talk! Thanks for posting.

  • @technokicksyourass · 6 years ago · +1

    Awesome! So many topics clearly and concisely explained.

    • @kawingchan · 1 year ago · +1

      Here in the post-ChatGPT era. Totally agree he has that succinct clarity of explanation. I don't think this can be taken for granted, since his own mentor, G. Hinton, is probably not as easy to follow sometimes (although I really like that Hinton often uses natural science/physics as a basis or analogy).

  • @faneaziz1872 · 5 years ago

    The best ever intro to AI

  • @jameskelmenson1927 · 6 years ago · +5

    That ending tho, what an incredible example. I couldn't see the video but what he said was very inspiring, and makes me wonder how we might go about feeding information to AI.

  • @webgpu · 9 months ago

    Hey Lex, thanks for sharing this video with us :-) 👍

  • @user-grkjehejiebsksj · 5 months ago

    It's come a long way. Thanks.

  • @zinyang8213 · 6 months ago · +4

    Q-star right here

  • @RhettAnderson · 6 months ago · +3

    OK! Time to watch this again.

  • @perriannesimkhovitch1127 · 1 year ago · +1

    Last I remember of MIT lecture stadiums was Thomas A Frank: the bow tie guy
    I had a Pritzi's honor moment when I blurted out in front of the hall of economists and said, Can you save Fermilab

  • @ChildOfTheLie96 · 4 years ago · +1

    Amazing stuff - this channel is great

  • @sabofx · 6 years ago

    Thanx guys! Great presentation!

  • @alexlebed2486 · 1 year ago · +3

    Wow, just 5 years ago. A question at 51:21: "understanding language ... the current state is very abysmal." Ilya: "simply training a bigger model will go surprisingly far." So Ilya could totally see ChatGPT way back then.

    • @ToriKo_ · 1 year ago

      And the fact that that much progress was made in five years is staggering and stops me in my tracks

  • @cheul0 · 3 years ago

    He talks with so much clarity and confidence, making sophisticated concepts effortlessly understandable. But even better, his words seem to offer insights that apply to life in general:
    18:49 "But that's a minor technicality. The crux of the idea is, you make the problem easier by ostensibly making it harder. By training a system, which aspires to learn to reach every state, to learn to achieve every goal, to learn to master its environment in general, you build a system which always learns something. It learns from success as well as from failure."
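
    The quoted idea is the heart of hindsight experience replay: after a failed episode, relabel the goal as the state that was actually reached, so the same trajectory becomes a success to learn from. Below is a minimal illustrative sketch with a made-up transition format and a sparse goal-reaching reward; it is not OpenAI's implementation.

    ```python
    from dataclasses import dataclass
    from typing import List

    # Illustrative HER-style relabeling; every name here is a simplifying assumption.

    @dataclass
    class Transition:
        state: tuple
        action: int
        achieved: tuple   # state actually reached after the action
        goal: tuple       # goal the agent was trying to reach
        reward: float

    def sparse_reward(achieved, goal):
        # 0 only if the goal was reached exactly, else -1 (sparse signal).
        return 0.0 if achieved == goal else -1.0

    def hindsight_relabel(episode: List[Transition]) -> List[Transition]:
        """Pretend the final achieved state was the goal all along."""
        new_goal = episode[-1].achieved
        return [
            Transition(t.state, t.action, t.achieved, new_goal,
                       sparse_reward(t.achieved, new_goal))
            for t in episode
        ]

    if __name__ == "__main__":
        # A 'failed' episode: the agent aimed for (5, 5) but ended at (2, 3).
        episode = [
            Transition((0, 0), 1, (1, 1), (5, 5), -1.0),
            Transition((1, 1), 0, (2, 3), (5, 5), -1.0),
        ]
        relabeled = hindsight_relabel(episode)
        # The last transition now carries reward 0: a success in hindsight.
        print(relabeled[-1].reward)  # 0.0
    ```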

  • @karlpages1970 · 6 years ago

    Good talk elaborating on some home truths.
    Thanks for the vid :-)

  • @JoseloSoft · 6 years ago · +1

    This is great, thanks a lot.

  • @SachinKumar-js8yd · 4 years ago · +1

    U earned my sub for this one.... Great!

  • @chickenwinck · 1 year ago · +2

    Thanks Lex

  • @ozguraslan5559 · 3 years ago

    thanks for the great lecture!

  • @JohnForbes · 6 years ago

    Favourite so far.

  • @puneetpuri2758 · 5 years ago · +2

    " real reward in life is existence or non-existence, everything else is a corollary to that " Ilya Sutskever

  • @AhmadM-on-Google · 6 years ago

    some good insight on DL from Ilya !

  • @gilbertengler9064 · 8 months ago

    simply excellent

  • @vadimborisov4824 · 6 years ago · +26

    Thanks for the video

  • @JazevoAudiosurf · 1 year ago · +4

    The thought I have in mind is whether it is somehow possible to improve the fundamentals of a net, say backpropagation or activation functions; that would probably be a much greater achievement than inventing yet another architecture improvement like LSTM to Transformer. Transformers really showed that very simple ideas like attention and positional encoding can vastly improve performance. I'm sure there will be more science done on the fundamentals. It seems like we invented what the von Neumann architecture is for neural nets just yesterday.

    • @connorkapooh2002 · 1 year ago · +1

      Oh yeah, absolutely. I *just know* that in the future we are going to look back and think "lol, look how archaic that was; how did we overlook it?" - like how I feel when I look at the perceptron paper and see the notation.
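
    To show how small those "very simple ideas" from the parent comment are on paper, here is a rough NumPy sketch of scaled dot-product attention plus sinusoidal position encodings. Shapes are arbitrary, there are no learned projections or multiple heads, and this is nowhere near a full Transformer.

    ```python
    import numpy as np

    # Rough sketch of attention + positional encoding; arbitrary shapes, no training.

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        # Each output row is a weighted average of V, with weights from Q·K^T / sqrt(d).
        d = Q.shape[-1]
        weights = softmax(Q @ K.T / np.sqrt(d))
        return weights @ V

    def sinusoidal_positions(seq_len, d_model):
        # Fixed positional encoding: sin/cos at geometrically spaced frequencies.
        pos = np.arange(seq_len)[:, None]
        i = np.arange(d_model)[None, :]
        angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
        return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

    if __name__ == "__main__":
        x = np.random.randn(4, 8) + sinusoidal_positions(4, 8)  # tokens + positions
        print(scaled_dot_product_attention(x, x, x).shape)      # (4, 8)
    ```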

  • @natecodesai · 2 years ago · +1

    On the question at 58:45, about meta-learning from observing other agents in a non-competitive environment: I think that, as humans, we can still be internally competitive with ourselves. The problem is properly defining the goal of the self's interaction with other agents. Does the self want to cooperate with the observed agent's actions or communication? Does the agent want to cooperate with the other agent by communicating a corrective behavior (suggestions)? Beyond that, there are other complexities, such as the fact that competition vs. cooperation is a binary model of a multi-agent reality; in many situations we compete and cooperate at the same time. If you are a psychiatrist, for instance, you compete with some parts of your client's psyche on sub-goals while maintaining the overall goal of cooperating with them in conversation to reach some curative result. If you reach that, you are also rewarded with more clients, more money, etc., fulfilling the competition you have with other psychiatrists to get clients and be a "successful psychiatrist".

  • @JustinHalford · 2 months ago

    Ilya’s take that self play allows us to convert compute into data is exactly why we will be seeing $100B scale supercomputer projects like StarGate. Amazing that he called this 6 years ago.

  • @KaplaBen · 3 months ago

    53:16 "simply training bigger deeper language models will go surprisingly far". Surprising indeed

  • @jameskelmenson1927 · 6 years ago · +6

    Thank you, this is riveting. Doubles as philosophy

    • @RogerBarraud · 6 years ago

      Bollocks - anyone who's ever actually riveted knows it's too freakin' loud to even think.

  • @nicholascantrell1179 · 6 years ago

    I appreciate the reminder that digital representations of ANN are really digital circuits.

  • @ankk98 · 6 months ago · +1

    Learning from mistakes is powerful

  • @mehdismaeili3743 · 1 year ago

    Excellent. thanks.

  • @tigeruby · 6 years ago · +27

    Ilya is an underheard speaker imo

    • @RogerBarraud · 6 years ago · +6

      Try turning up the volume, then.

    • @bijan1316 · 6 years ago

      If no one could hear him surely someone would have said something.

    • @shubhamp.4155 · 4 years ago · +2

      I agree. He has impressive clarity and depth in ideas. Also, I think the two other comments made here (about volume and hearing) are idiotic.

  • @abdu2 · 5 years ago

    Thanks for sharing 👍

  • @matthewchunk3689 · 4 years ago

    thank you!

  • @danielf9110 · 6 years ago

    This was a-m-a-z-i-n-g

  • @labsanta · 1 year ago · +7

    My learnings:
    Ilya Sutskever: OpenAI Meta-Learning and Self-Play | MIT Artificial General Intelligence (AGI)
    Deep Learning's Mathematical Theory
    Deep learning is based on the mathematical theory that if you can find the shortest program that generates data, then you can use it to make the best predictions possible. While the problem of finding the best short program is computationally intractable, it is possible to find the best small circuits using backpropagation. This fact is the basis of artificial intelligence and enables us to iteratively make small changes to the neural network until its predictions satisfy the data.
    01:22
    Interpreting reward from observation in reinforcement learning
    In reinforcement learning, the environment communicates reward and observation to the agent. However, in the real world, the agent must figure out what the reward is from the observation. There is only one true reward in life, which is existence or nonexistence. To implement reinforcement learning, a neural network is used to map observations to actions, and the learning algorithm changes the parameters based on the results of the actions. There are two classes of reinforcement learning algorithms: policy gradient and Q learning-based algorithms.
    07:56
    - Neural architecture search can solve small problems which can be generalized for larger problems
    Neural architecture search can be used to solve small problems which can be generalized for larger problems. It is a way of doing meta-learning where the architecture or learning algorithm is learned for new tasks. This helps in solving many tasks and making use of the experience in a more efficient way.
    15:21
    - Learning policies that quickly adapt to the real world
    In order to address the problem of simulating friction, a simple idea is to learn a policy that quickly adapts itself to the real world. This can be achieved by randomizing the simulator with a huge amount of variability such as friction, masses, length of objects and their dimensions. By doing so, you learn a certain degree of adaptability into the policy, which can work well when deployed on the physical robot. This is a promising technique and has a closed-loop nature of the policy.
    23:07
    - Self-play is an attractive approach to building intelligent systems
    Self-play is an approach to building intelligent systems where agents create their own environment and compete with each other to improve. It can lead to the development of better strategies and can be used to demonstrate unusual results. Self-play provides a way to create challenges that are exactly the right level of difficulty for each agent, making it an attractive approach to building intelligent systems.
    32:11
    - Society of agents is a plausible place where fully general intelligence will emerge
    If you believe that a society of agents is a plausible place where fully general intelligence will emerge and accept that our experience with the Dota BOTS we've seen a very rapid increase in competence will carry over once all the details are right, then it should follow that we should see a very rapid increase in the competence of our agents as they live in the Society of agents.
    38:51
    - Discovering effective strategies through imitation in game-playing bots
    The speaker shares an anecdote about a game-playing bot that was able to beat a human player by performing certain effective strategies. The human player then imitated one of these strategies and was able to defeat a better player. This suggests that the strategies discovered by game-playing bots are real and have real-world applications, and that fundamental game-play strategies are deeply related. The speaker also discusses the application of reinforcement learning and the importance of maximizing expected reward. Finally, the speaker considers the role of cooperation in game-playing bots and the complexity of simulation and optimization problems in artificial intelligence.
    - Evolutionary strategies not great for reinforcement learning
    The speaker believes that normal reinforcement learning algorithms are better for reinforcement learning, especially with big policies. However, if you want to evolve a small compact object like a piece of code, then evolutionary strategies could be seriously considered. Evolving a beautiful piece of code is a cool idea, but still, a lot of work needs to be done before we get there.
    53:55
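
    The self-play section of the summary above reduces to a short loop: the learner plays against frozen snapshots of itself, so the opponent is always at roughly the right level of difficulty. Here is a toy illustrative sketch in which the game, the policy, and the "training" step are all placeholders, not anything from the talk or OpenAI's systems.

    ```python
    import copy
    import random

    # Toy self-play loop; every component here is a stand-in.

    class Policy:
        def __init__(self, skill=0.0):
            self.skill = skill
        def act(self):
            # Higher skill -> better chance of winning a 'round' of the toy game.
            return random.random() + self.skill

    def play_match(p1, p2):
        # Returns +1 if p1 wins, -1 otherwise (toy zero-sum game).
        return 1 if p1.act() > p2.act() else -1

    def train(policy, outcome):
        # Stand-in for a real RL update: nudge skill toward whatever won.
        policy.skill += 0.01 * outcome

    learner, opponent_pool = Policy(), [Policy()]
    for step in range(1000):
        opponent = random.choice(opponent_pool)       # a past snapshot of itself
        outcome = play_match(learner, opponent)
        train(learner, outcome)
        if step % 100 == 0:
            opponent_pool.append(copy.deepcopy(learner))  # freeze a new opponent

    print(f"final skill: {learner.skill:.2f}, pool size: {len(opponent_pool)}")
    ```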

  • @ZhengCheng · 6 years ago

    Great video!

  • @ileena8129 · 6 years ago

    Reaching to learn...🎶

  • @dreamindreamoutnow9151 · 5 years ago

    Thanks.

  • @example.com. · 6 years ago · +2

    Great thanks a lot!

  • @jffy9005 · 6 months ago

    this guy is a genius

  • @VincentKun · 1 year ago · +1

    Seeing this video after GPT-4 is really something.

  • @brylevkirill · 6 years ago · +6

    TD-Gammon didn't use Q-learning - it used TD(λ) with online on-policy updates.
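
    For readers who want the correction spelled out: TD(λ) evaluates the policy it is actually following, spreading each one-step TD error backwards over recently visited states via eligibility traces, instead of bootstrapping from a max over actions as Q-learning does. A minimal tabular sketch with made-up states and hyperparameters:

    ```python
    from collections import defaultdict

    # Minimal tabular TD(lambda) value update with accumulating eligibility traces.
    # States, rewards, and hyperparameters are illustrative assumptions.
    ALPHA, GAMMA, LAMBDA = 0.1, 0.99, 0.8

    def td_lambda_episode(trajectory, V):
        """trajectory: list of (state, reward, next_state) following the current policy."""
        traces = defaultdict(float)
        for s, r, s_next in trajectory:
            delta = r + GAMMA * V[s_next] - V[s]    # one-step TD error
            traces[s] += 1.0                        # mark the visited state as eligible
            for state in list(traces):
                V[state] += ALPHA * delta * traces[state]
                traces[state] *= GAMMA * LAMBDA     # decay credit toward older states
        return V

    if __name__ == "__main__":
        V = defaultdict(float)
        episode = [("s0", 0.0, "s1"), ("s1", 0.0, "s2"), ("s2", 1.0, "terminal")]
        td_lambda_episode(episode, V)
        print(dict(V))
    ```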

  • @trylks · 6 years ago

    8:25 "And there is only one real true reward in life, and this is existence or non-existence, and everything else is a corollary of that." OK, that was _deep_. I would say surviving is a shared necessary condition that has many implications and that it could lead to a new era of better politics, if it got the attention it deserves. And I would not say that everything else is "a corollary", but I agree to a good extent. The video is awesome, it is just that this point may be the most important, although it is one not strongly related to machine learning.

  • @markusthapa15 · 1 year ago

    insane vision

  • @Trackman2007 · 6 years ago · +6

    Meta learning sounds so much easier for the beginner than reinforcement learning. Hopefully meta learning will progress into something nice & stable

    • @dexterovski · 6 years ago · +12

      Trackman2007 spoiler alert: it isn't.

    • @RogerBarraud · 6 years ago · +3

      Meh: Ozzie was already Metal-earning back in '68.

  • @markpfeffer7487 · 1 year ago

    49:25 will hopefully age well in regards to AGI -- now that we're that much closer, thanks in part to Ilya's work. I hope he's right.

  • @shibbleswentworth · 6 years ago

    So, is there another type of learning called "observation", rather than reinforcement or supervised? It seems complementary to, but fundamentally different from, either.

  • @420_gunna · 6 years ago · +2

    Wowwww! Just starting but it's awesome that you're nabbing these guys. Can we get a teaser as to who you're looking at for future lectures? :O

  • @deeplearningpartnership · 6 years ago · +4

    Thank you. And thank you MIT.

  • @jon_______ · 1 year ago · +4

    51:22 Audience: language models are bad. Ilya goes on to predict how they will get better. Here we are in January 2023 with ChatGPT, which took the world by storm. Remind me: 2028.

  • @burlemanimounika7631 · 3 years ago · +1

    I wish he would teach deep learning from a practical perspective, the way Andrew Ng does.

  • @PixelPhobiac · 4 years ago

    Quality stuff

  • @Georgehwp · 2 years ago · +2

    On the simulation -> real environment problem: is this not just another task for a neural network? Find the transformation that maps the behaviour of the simulator to the physical machine.
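
    One way to read that suggestion as code: collect real-world transitions, fit a small model to the residual between the simulator's prediction and what actually happened, then plan with simulator-plus-correction. Below is a hedged sketch on synthetic data; the dynamics, features, and least-squares correction model are all stand-ins, not anything from the talk.

    ```python
    import numpy as np

    # Toy 'sim-to-real' residual fit: learn the gap between a simulator's
    # next-state prediction and (synthetic) real-world outcomes.
    rng = np.random.default_rng(0)

    def simulator_step(state, action):
        return state + 0.1 * action                 # idealized dynamics

    def real_step(state, action):
        # Synthetic 'real' system with a friction-like mismatch plus noise.
        return state + 0.08 * action - 0.02 + rng.normal(0, 0.001)

    # Collect (state, action) -> residual data from the 'real' system.
    X, y = [], []
    for _ in range(500):
        s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
        X.append([s, a, 1.0])                       # linear features + bias
        y.append(real_step(s, a) - simulator_step(s, a))
    X, y = np.array(X), np.array(y)

    w, *_ = np.linalg.lstsq(X, y, rcond=None)       # least-squares residual model

    def corrected_step(state, action):
        # Simulator prediction plus the learned correction term.
        return simulator_step(state, action) + np.array([state, action, 1.0]) @ w

    print("residual weights:", np.round(w, 3))      # roughly [0, -0.02, -0.02]
    ```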

  • @binxuwang4960 · 3 years ago

    8:30 So well put: it's us who assign reward to our senses.

  • @petkish · 5 years ago

    The talk is amazing! I just heard the zombie sounds from the humanoid figures playing soccer. Or is it just imagination?

  • @joaovitordemelo8209 · 3 months ago

    Damn it's crazy listening to 51:20 seeing what Ilya has done a few years later

  • @aga5979 · 1 year ago · +5

    who is watching in 2023-03 after gpt4 is released. :)

    • @VincentKun · 1 year ago · +1

      It really is something. The guy who asked implied generative language models were bad; that was only 4 years ago.

  • @ankk98 · 6 months ago · +1

    Self play is really promising idea, emulating the biological evolution

  • @Ammastaal · 4 years ago · +6

    "Math exists!" - Ilya Sutskever

    • @ac4740 · 3 years ago

      hi, im from the internet, and i disagree

  • @Mike-tb2hw · 1 year ago · +1

    49:30 "I think we'll get cooperation whether we like it or not" - definitely sounds a lot eerier in 2023 lol

  • @rob9756 · 1 year ago

    So that's just Kolmogorov's definition of probability :) Good of you to remember your roots.