I really enjoyed this conversation with David. Here's the outline:
0:00 - Introduction
4:09 - First program
11:11 - AlphaGo
21:42 - Rules of the game of Go
25:37 - Reinforcement learning: personal journey
30:15 - What is reinforcement learning?
43:51 - AlphaGo (continued)
53:40 - Supervised learning and self play in AlphaGo
1:06:12 - Lee Sedol's retirement from Go
1:08:57 - Garry Kasparov
1:14:10 - Alpha Zero and self play
1:31:29 - Creativity in AlphaZero
1:35:21 - AlphaZero applications
1:37:59 - Reward functions
1:40:51 - Meaning of life
OMG THANK YOU
Thank you very much Lex 🙏
Thanks
Please invite Humberto Maturana: he has developed theories on human intelligence, consciousness, and understanding. He is in his 90s; we could lose the chance to hear his take on artificial intelligence
Bring David Deutsch please! :)
"He'll be remembered as the last person to beat AlphaGo"
man!!
... kudos and respect on that comment! Greetings from s.lem jr ...
Again, Mr. Fridman, THANK YOU for keeping this going, especially now. When I need to get my mind off the current world situation I come here. Your talks always take me to a better place. Thank you. Be safe. Stay healthy.
Seeing this after the AlphaGo doc!
Watching the documentary before watching this interview definitely adds value. th-cam.com/video/WXuK6gekU1Y/w-d-xo.html
As have I! I was searching of an Alpha Zero doc. This is where I got so far. Not disappointed at all!
Yes came here directly after the Doc as well. Had never heard of GO! prior to 3hrs a go. Indelibly registered and imprinted now :D
Same
maap no need to capitalize and exclaim, any more than you’d write CHESS!
THIS IS THE ONE I'VE BEEN WAITING FOR!
@@mikhailfranco dude, thanks 🙌
This is a banger of an interview. AlphaZero is a harbinger of the future
I can't describe or express how valuable this interview is for understanding what's going to happen in the future
The Future is the pieces of our preferred past we haven't pulled down yet. Welcome home.
Amazing, these conversations are so meaningful to the future of humanity that they should be broadcast on national television. That way children would more easily find meaningful role models and access to the type of insightful ideas that give birth to passions and eventually discoveries.
I totally agree with you. These are the role models our children should be familiarized with, not the attention addicts on social media who act as a catalyst to remove the brain from the anatomy of human beings.
I also totally agree.. So beautifully phrased!!
Lol who watches national television though? More people will watch it on YouTube.
Thanks for making this podcast. David Silver chooses his words very well, his stories are very clear and inspiring! I could have listened much longer ;-)
you just gotta love David Silver and his ideas, thoughts and accent
Discovery is a joy. Discovering the existence of David Silver and his amazing way of thinking is pure gold. Thank you Lex.
I watched the AlphaGo vs. Lee Sedol tournament documentary DeepMind recently uploaded, and I cried. It was so inspiring, touching, and beautiful. Thanks very much Lex for this podcast.
Anyone else get excited by DeepMind's latest "MuZero" algorithm that David discussed, starting from about 1:28:00 into the video? Supposedly a new algorithm that is able to figure out the rules and constraints of the environment by itself. I'd love to hear more in-depth discussions about MuZero's capabilities in future talks with DeepMind's finest 😎!
I am very happy to see that 3.22M people are watching this channel.
His answers are so articulate!
Lex,
It is very clear that you love what you do. It totally shows.
You are always super prepared and well engaged with your guests.
Yours has become my absolute favorite podcast. Listening to a 2 hr podcast of yours is as intellectually fulfilling as reading an incredible 400-page book.
Oh man! That meaning of life interpretation! I think I'm gonna click this 1:41:20 every night before sleep from now on.
Thank you Lex for making this possible! ❤️
I initially cringed a little when Lex decided to "go there" with the meaning of life question but pshew! Silver gave a great answer.
sabelch yes, that answer was very impressive and I think demonstrated his capacity for deep thinking
I was laughing to myself and thinking: "All he needs to do now is ask him the meaning of life question". And then he did!
Indeed, probably David's comment regarding the meaning of life was by far the most philosophically meaningful I have ever come across.
there's a book called 'The Fabric of Reality'
I can ignore everyone else but David Silver talking about AI. His lectures and courses taught me RL.
Thanks
This was the AI interview I've been waiting for - it did deliver. It could have been a bit longer and included the protein folding work, though. Perhaps that's ongoing and still a competitive area. There is a certain clarity of articulation from the guests I enjoy most - reminds me of Jeff Hawkins. Also a sense of practical application.
Pala
They figured it out
@@Jacob-sb3su they?
Many academics are terrible at explaining their domain of expertise. David is a quality academic and has remained grounded enough to explain himself to normal folk like me. Well done.
1:06:48 That part implies that Lee Se-dol retired because of AlphaGo, while in reality he retired because of his dissatisfaction with the Korea Baduk Association, from which he quit in 2016. He mentioned AlphaGo but it is not the reason he quit.
1:40:51 : One of the best answers for the purpose and meaning of life I have heard so far. Incredible!
3 years later I am here... The latest AI developments make me ask for a second round with David Silver. Thanks for sharing 👍🏼
David and Demis, hope you get a Nobel Prize someday soon.
It's 2024. It happened, man! Demis got a Nobel Prize for his breakthrough work on proteins
@@jonathanmahenge8263 But I didn't know it was going to be that soon 😄. Just incredible and well deserved.
Watch this documentary if you want to get into the story of AlphaGo:
th-cam.com/video/WXuK6gekU1Y/w-d-xo.html
Man, David Silver is so incredibly humble...
Alpha Zero - "Give the system the ability to correct its own errors"
Favorite parts:
1:21:16 - Self-play is optimal because the NN learns most robustly by making mistakes. Conclusion: there is no “pill” for intelligence, you evolve intelligence by correcting errors. AI introduced to the physical world would need systems tolerant to making countless errors.
1:24:47 - One model will beat another 100-0. We can construct a tower of models this way, each better than the previous. What's unclear is if this tower is totally ordered or partially ordered. Can a lower node beat a node higher in the tower? When does this occur? Where is this saturation point? How much higher is it than human intelligence in Go? There may exist an equilibrium of Go intelligences, not a greatest Go intelligence. This is the result of minimax optimization vs global optimization. (A toy sketch of this ordering question follows below.)
1:41:20 - It concludes with a fun interpretation of the meaning of it all :)
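For anyone who wants to poke at that 1:24:47 ordering question concretely, here is a minimal sketch of how you could test a "tower" of checkpoints for non-transitive match-ups. Everything in it is hypothetical: the `ToyAgent` class, its Elo-style win probability, and the `beats`/`find_cycles` helpers are made up for illustration, and real checkpoints would be trained networks evaluated over many games.

```python
import itertools
import random

class ToyAgent:
    """Stand-in for one self-play checkpoint: just a single 'strength' number.
    (Purely illustrative; real checkpoints would be trained networks.)"""
    def __init__(self, strength):
        self.strength = strength

    def play(self, other):
        # Win probability from an Elo-style logistic curve on the strength gap.
        p_win = 1.0 / (1.0 + 10 ** ((other.strength - self.strength) / 400))
        return random.random() < p_win

def beats(a, b, games=200):
    """True if agent a wins the majority of a head-to-head match."""
    return sum(a.play(b) for _ in range(games)) > games / 2

def find_cycles(agents, games=200):
    """Round-robin all checkpoints and return non-transitive triples
    (i beats j, j beats k, yet k beats i). An empty result means the
    tower behaved like a total order in this sample of match-ups."""
    result = {}
    for i, j in itertools.combinations(range(len(agents)), 2):
        result[(i, j)] = beats(agents[i], agents[j], games)
        result[(j, i)] = not result[(i, j)]
    return [(i, j, k)
            for i, j, k in itertools.permutations(range(len(agents)), 3)
            if result[(i, j)] and result[(j, k)] and result[(k, i)]]

if __name__ == "__main__":
    ladder = [ToyAgent(1500 + 100 * n) for n in range(5)]  # successive checkpoints
    print(find_cycles(ladder))  # usually [] for this transitive toy model
```

The toy model is transitive by construction, so it reports no cycles; the open question in the comment is whether real self-play towers stay that way or eventually contain rock-paper-scissors-style loops.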
Agree with your comments. Regarding 1:24:47, I feel his statement merely reflects his desires and not the future reality of all programmed systems. Yes, something far more intelligent can beat something far less ordered in a limited gaming setting, but possibly not all the time. There is a limit to success in a totally ordered system where the outcome of two perfect playing systems ends in stalemate most of the time. I would have liked to have heard the results of AlphaGo or AlphaZero playing against itself with recursive/feedback learning turned off.
Hey Lex! I really dig your podcasts. They're unique and inspiring. I wish I were smart enough to contribute to the AI community.
I hope you will allow me to make a suggestion about the questions you ask your guests. Please ask more about the aspects of AI that still lie beyond the capability of the systems that your guests have built. Have them explain why these capabilities are so hard to achieve. Also, ask them what the current or next thing is, that they are working on. And on this topic, ask what the challenges are that they're currently facing?
TYIA!
He did ask, and David did elaborate that life's problems are not so structured, that they are messy, as he put it. There are variables that the repetition algorithm would not know how to sufficiently handle without a ton of errors, which is how it learns in the first place. We humans have intuition, to varying degrees, to assist us. That's one reason, messiness, why it's hard to achieve, besides the fact that we don't yet know how to code for it. Alpha*Anything is a great breakthrough but very much in its infancy.
Thanks for putting the ads at the beginning!! It's way better than getting your concentration broken mid-interview
This interview was so good it brought a tear to my eye!
Incredible podcast, probably my favourite! It would be incredible to have a second part!
6 months ago I didn’t even know who Lex was, now I can’t get enough of his podcasts. The powers of the internet. I hope he does become a billionaire.
David Silver is a real legend
This interview is LEGENDARY!... watching it for the second time. Definitely in the top 3 on YouTube!
I love how the wall and window are decorated to resemble a go board
David is an amazing being.
I am struck by how small the audience is for this astonishing talk. It is so important that it should number in the millions, even billions.
Thank you, Lex and David! Very interesting and inspiring conversation about the first principles of Artificial Intelligence.
Get Demis on here please!
Amen
Yes!
Yes please Lex, Demis would be awesome 😎
Brilliant interview. Articulate, and like you, I believe AlphaGo was a tipping point for the progress of humanity.
My Saturday blockbuster, thanks Lex. David is a cool dude, have to get Demis in now :)
Such an inspiring conversation. As a PhD candidate who works on deep RL, I am quite motivated to try even harder! Thanks for your efforts Lex!
such an annoying comment, as someone who hates humble bragger, I am quite motivated to downvote your comment! Thanks mr poo on road!
@@smegmaprince314??? He just said he's inspired because he's working toward entering the same field as the podcast guest. Don't be dumb and weird.
This is the best of all episodes and I know I am biased. Thanks Lex.
Thank you both!
It was, again, an awesome conversation.
Wow, very insightful, nice to get our minds off of the pandemic and look to a bright future. Incredible potential behind DRL!
Wow! This was an incredibly insightful and inspiring conversation. Thank you Lex, David, and your teams for this.
Mind teased, tantalized, and finally thrown into a tizzy. Love every one of your interviews Lex. All I want to do is watch them to get inspired to think in new ways. THANKS MAN!
Thanks Lex! Even bigger greatness is coming your way!! Cheers! Stay safe!
I love the content you put out man! It's always interesting, always paradigm challenging, calm, informed, you! Thanks!
Crazy, Lex.. I just went down the alpha learning machine rabbit hole this week. I watched the documentary on AlphaGo, which was fascinating. I also watched the matches between the pro StarCraft players and AlphaStar, which was even more fascinating (partially because I'm familiar with the game). I wonder, in this sphere, how far a deep learning machine like this can go. This podcast was the icing on the cake at the bottom of the rabbit hole, thanks brother!
Very proud of my old university - University of Alberta. Dr. Silver got his PhD there under Richard Sutton. Great interview. Was looking forward to this one.
Those who don’t have sophisticated backgrounds in Programming can really appreciate the way you relate what the computers are doing and capable of doing to the romantic human narratives
This is a really great interview and very enlightening. Thanks for all of your hard work bringing this stuff to us. Keep up the good work.
Awesome conversation, David is incredibly interesting and humble also amazing questions from Lex. Thanks to both of you for making it.
Excellent podcast, thank you
I found it interesting that there was some bafflement at the power of randomness. Randomness (mutations) coupled with an objective function (maximise fitness) produced us and all the wonders of life. What could be more powerful than that?!
That's exactly what I thought. From being microbes in the sea our algorithm was basically small random variations and then pass/fail reproduction.
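Just to make that "small random variations and then pass/fail reproduction" loop concrete, here is a tiny toy version, a classic (1+1)-style evolutionary loop. The target string, alphabet, mutation rate, and fitness function are all invented for illustration and have nothing to do with real biology.

```python
import random

# Toy illustration of "random variation plus pass/fail selection":
# a (1+1) evolutionary loop that climbs toward an arbitrary target string.
TARGET = "maximise fitness"
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(candidate):
    """Number of characters matching the target (the 'objective function')."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(candidate, rate=0.05):
    """Random variation: each character has a small chance of being replaced."""
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in candidate)

parent = "".join(random.choice(ALPHABET) for _ in TARGET)
generation = 0
while fitness(parent) < len(TARGET):
    child = mutate(parent)
    if fitness(child) >= fitness(parent):  # pass/fail: keep only non-worse offspring
        parent = child
    generation += 1

print(f"reached '{parent}' after {generation} generations")
```

The objective function here is supplied by the programmer, which is exactly the point being debated in the replies below.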
Quite a standard game theory concept. Correctly randomising close-EV decisions will result in the optimal/unexploitable solution.
Celtlen I'm sure your intuition is correct. How about you run a simulation for a few billion years and get back to us with your results?
Who/What inserted the objective function phase? Evolution is not a power, or a force, or an entity that can be identified, cataloged, or bottled. It's a process inherent in the variation of pre-coded genetic material. Time itself codes nothing; the possibility already has to be there within the code, same as Alpha*anything. The programmers programmed it to learn from reiteration or feedback repetition, after first a combination of repetition and a database of the best played games of Go, which is a breakthrough in our traditional thinking of how programming should or could work. I, for one, welcome our new AI Overlords.
@@chrisofnottingham Failure in genetics means eventual, if not immediate, death of the system: no procreation, no passing Go, no collecting $200. Your simple pass/fail random reproduction system as the *progression* of a biological organism doesn't exist in any course of biology. Random mutations, otherwise known as re-coding errors, are mutations that no known biological system is built upon for speciation. Interspecies copulation usually results in sterility, if successful.
I must say, one of the best podcasts. Thanks, Lex and David
At some point around 12:00 David mentions that he just came from a panel discussion with Kasparov and the Deep Blue guy... any reference, link to that maybe?
Murray Campbell
YES DEEPMIND!!! (I had decided to write in all caps when I saw the thumbnail)
thank you again lex, another phenomenal interview, i cannot get enough of this wonderful channel!
His course on YouTube is amazing
This is an instant like from me :)! Many thanks Lex!
You, Sir, are a gentleman and a scholar.
Mate thank you for your videos. your channel is great.
Hey Lex, really interesting episode. A guest I think you should have on your podcast is Leo Gura. His work is particularly focused on the nature of consciousness, and he is for me one of the most insightful people I have ever listened to.
Trying to reproduce the MCTS results on some other tasks. After several weeks of struggling, I learned that David Silver is really great in the sense that he foresaw the future of deep learning research -- computational power really matters.
Love David Silver's lectures on RL
Man, David Silver is such a genius! I've enjoyed the interview so much.
I wouldn't say Lex's interview policy can be considered optimal yet, but the story you create through your questions, the way you try to get to the essence when you close your eyes, and just the way you are make it really close. If you read this, thank you
Thank you, one of the most interesting talks in a long time!
Thank you!! Been looking forward to this.
David is adorable, I have watched his RL course 3-4 times. Brilliant guy and funny too
I learned about a new dimension of thinking and understanding things.
It's funny, I got the chance to watch it again today. Now this interview.
What a fantastic conversation!!!
There is a crucial aspect of the Google Alpha projects that is, to my knowledge, never touched upon in any interviews; it is only mentioned in the technical articles published by the researchers.
The Google Alpha projects use a hybrid approach: the *tree search* is taken care of with explicit programming. The part of the play that is handled by the deep learning technology is the *evaluation* of positions. The tree search needs two kinds of position evaluation: 1) deciding which nodes to prune from the search tree, and 2) evaluating end nodes.
That is:
The Alpha projects did not have to learn to look ahead.
That is very significant, because deep learning is ill suited to learning to anticipate.
To my understanding, this is why Google DeepMind has shifted away from the board game projects and has moved to StarCraft play.
For comparison: OpenAI started its efforts with a task more difficult than the hybrid approach of the Alpha projects. Competitive DOTA play (and any game that requires human-level skill) involves *long-term goals*. To my understanding, the OpenAI machines aren't hybrid; they have to learn everything, including mastery of long-term goals.
Reinforcement learning is feasible when the reward is almost immediate. The bigger the time lapse between action and reward, the harder it is for the reinforcement learning to get traction.
So: the Alpha projects are hybrids: no learning to look ahead. When this aspect is not mentioned in an interview, then in my opinion the interview is severely devalued.
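To make that division of labour concrete, here is a minimal sketch in the same spirit: the look-ahead is ordinary hand-written search, and the "learned" parts are reduced to placeholder functions. This is not AlphaGo's actual MCTS; it's a toy depth-limited negamax over a made-up game (`Nim`), with `policy_net` and `value_net` standing in for trained networks, used exactly where the comment says: pruning candidate moves and evaluating positions at the search horizon.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nim:
    """Tiny stand-in game: take 1-3 stones; whoever takes the last stone wins."""
    stones: int

    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]

    def apply(self, move):
        return Nim(self.stones - move)

    def is_terminal(self):
        return self.stones == 0

    def result(self):
        # The player to move at an empty pile has lost: the opponent took the last stone.
        return -1.0

def policy_net(state):
    """Stand-in for a policy network: move -> prior probability (uniform placeholder)."""
    moves = state.legal_moves()
    return {m: 1.0 / len(moves) for m in moves}

def value_net(state):
    """Stand-in for a value network: position -> estimated value in [-1, 1]."""
    return 0.0  # neutral placeholder

def search(state, depth, top_k=3):
    """Hand-coded look-ahead (negamax). The 'learned' evaluation appears in the
    two places named above: pruning (keep only the top_k moves by policy prior)
    and scoring positions at the search horizon."""
    if state.is_terminal():
        return state.result()
    if depth == 0:
        return value_net(state)                                        # horizon evaluation
    priors = policy_net(state)
    candidates = sorted(priors, key=priors.get, reverse=True)[:top_k]  # pruning
    return max(-search(state.apply(m), depth - 1, top_k) for m in candidates)

def best_move(state, depth=6):
    return max(state.legal_moves(), key=lambda m: -search(state.apply(m), depth - 1))

print(best_move(Nim(7)))  # -> 3, the optimal Nim move (leave a multiple of 4)
```

Swapping Nim for Go and the placeholders for deep networks is where all the real work lies, of course; the point is only that the look-ahead itself is programmed, not learned.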
I think it's a point well made about look-ahead and the difficulty RL has with the disconnection of actions and reward. But as to explicit search vs deep learning for evaluation, I think it's fair to say the learned evaluation is doing the work of look-ahead too, because the eval encodes some version of knowing what will happen -- the outcome. It's kind of a backwards way of thinking compared to how humans will often take a board state and chew on it, but imagine if you could "look ahead" 1 turn in any game and give a *very* accurate eval: your "search" algo would be really easy, and it'd require no modeling of state or system.
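As a footnote to that "a perfect one-ply eval makes search trivial" point: reusing the toy `Nim` and the hypothetical value-function interface from the sketch above, a one-ply "search" is just a greedy pick of the child position the opponent likes least.

```python
def greedy_move(state, value_fn):
    """One-ply 'search': pick the move leading to the position that is worst
    for the opponent (who moves next). With a perfect value_fn this already
    plays perfectly; all the difficulty is hidden inside the evaluation."""
    return min(state.legal_moves(), key=lambda m: value_fn(state.apply(m)))

# A hand-written perfect evaluator exists for toy Nim: losing iff stones % 4 == 0.
perfect_value = lambda s: -1.0 if s.stones % 4 == 0 else 1.0
print(greedy_move(Nim(7), perfect_value))  # -> 3, same optimal move, no deep search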
@@oncedidactic David Silver talks about that starting around 00:49:00.
DeepMind's first exploratory research was analogous to deep learning for image recognition. A large collection of labeled images is processed, and if the learning process has been successful then the system is able to achieve high scores on image recognition for images it has not seen before.
So that first exploratory system was presented with labeled positions, presumably labeled in terms of whether the position is equal or one color is likely to achieve a win. Presumably (I haven't read that article) the input/output pair was: input, the current position; output, what the human player played next.
David Silver describes that by design this exploratory system had no tree search, so it was very interesting to them what level of play would be achieved. David describes that this system achieved dan-level play on the 19 by 19 board. While still very far from top-level play, this was already on par with the best tree-search-based systems.
I'm not a Go player, but my understanding is that for human players, looking ahead explicitly is relatively rare. Thinking ahead explicitly is possible only when, for both sides, there is only one viable move on each turn, with only an occasional branch. My understanding: generally in Go there are always many, many viable moves.
I disagree with your proposal that there is an element of *implicit* look-ahead.
I prefer to think of the instantaneous evaluation in terms of the overall health of the position. The system has learned to appreciate positions with good overall health.
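For anyone curious what "input: the position, output: the human move" looks like mechanically, here is a from-scratch toy of that supervised setup. Everything is invented for illustration: a tiny 5x5 "board", a random fake dataset, and a single linear layer trained with cross-entropy, whereas the real system used a deep convolutional network trained on a large database of expert 19x19 positions.

```python
import numpy as np

# Toy supervised 'policy' training: input = a board position,
# target = the move a human (here: random fake data) played there.
BOARD = 5 * 5                      # tiny 5x5 board, flattened
rng = np.random.default_rng(0)

positions = rng.integers(-1, 2, size=(1000, BOARD)).astype(float)  # -1/0/+1 stones
human_moves = rng.integers(0, BOARD, size=1000)                    # move = a point index

W = np.zeros((BOARD, BOARD))       # single linear layer: position -> move logits

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.1
for epoch in range(50):
    probs = softmax(positions @ W)                           # predicted move distribution
    onehot = np.eye(BOARD)[human_moves]                      # the move the 'human' played
    grad = positions.T @ (probs - onehot) / len(positions)   # cross-entropy gradient
    W -= lr * grad

accuracy = (softmax(positions @ W).argmax(axis=1) == human_moves).mean()
print(f"top-1 move-prediction accuracy on the toy data: {accuracy:.2f}")
```

The mechanics are the same as ordinary image classification, which is exactly the analogy drawn in the comment above; no search is involved anywhere.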
Great podcast! Please get Jakob Foerster too!
I love you Lex Fridman
Thanks for the boss content empowering people. Many young people are enjoying this content and, in my opinion, it is such a treasure, the exponential tune to your tone.
Fantastic one!! So many cool ideas in there!! Thanks Lex 🤘🏽
What a great conversation! Now I finally understand how AlphaGo and AlphaZero were created.
Hey man, awesome interviews! You seem to be a really good person. Thank you for what you are doing.
Thank you for another enlightening, exploratory, and meaningful conversation that pushes us towards self-questioning and, one hopes, self-understanding.
This talk is so inspiring.
Many thanks for sharing this amazing interview!
The work this man has done with his team is both amazing and an absolutely terrifying necessity in the process of human progress. The DeepMind team and the wider field of AGI research will bring a revolution more radical than the industrial and digital ones
I'm currently taking his RL lectures. Thanks
Good to hear the logic-based programming language Prolog mentioned.
And here we have AlphaFold 😸⚡
amazing episode
Great interview.
Thank you Lex, Great convo.
@37:00 « DL can understand the World itself » I have doubts... big doubts... How would DL (or even better, RDL) understand that any object falls at the same speed without air resistance (without us already knowing it)?
Man i can’t thank you enough ❤️
Summary of his answer to the meaning of life:
Perhaps the universe's meaning of life is to maximize entropy through physical laws. In trying to do this, it discovered this biological method for dispersing energy that works on planetary scales, called life. To make more of life, it implemented evolution, which in itself had its own goal of passing on genes; the meaning of life for evolution is to pass on genes. This deviated slightly from the universe's goal, but it remained close enough.
In order to better pass on genes, some biological life grew out of its hard-coded mannerisms and developed a sophisticated, wet, electric computer that cataloged and stored old data, which it used to simulate future events to better survive the complex ecosystem (the brain). The brain in turn deviated its meaning of life from evolution's, though it remained fairly aligned (aligned enough to make it work, at least).
The brain's meaning of life varies, but it's generally some mixture of achieving well-being for itself, helping other conscious beings discover that too, and unveiling truths.
Now the brains of the world are creating new types of intelligence to better achieve their goals, which will likely partially align with our own, but also deviate in much the same way that our meaning deviated from evolution's, and evolution's deviated from the universe's. The meaning of life is a never ending stack of compounding energy dispersal errors.
I added my own spices to the mix, though that was mostly David's special word stew.
nice
Unfortunately, nothing, from which everything proceeded, has no intentions, no innate ability to create, and no substance from which to do so. There is no scientific test or conjecture that can prove or intelligently explain how nothing can have intention. What you are talking about is intelligence behind design and action, just as he describes @1:22:30 onward.
@Tarik El Lel No problem, muse away!
What a profound way to discuss the meaning of life!
There are several layers to the meaning of life. The first layer would be: "Does the universe have a meaning?" Well, it looks like it operates on some very fine-tuned laws and constants; at first glance, there is no meaning. For the next layer, let's look at the 2nd law of thermodynamics. Its purpose is to increase entropy. What if evolution is just a mechanism (a sub-goal) in order to increase entropy further? Evolution's goal is to reproduce efficiently, in other words to spread energy efficiently. Because of it, entropy will increase as efficiently as possible.
This line of thought is truly mind-blowing. I will probably not be able to sleep for 2-3 days because of thinking about this concept...
Lex and David, Thank you for the conversation!
I dare say this was THE most interesting episode so far! Deep learning is solving perception big time but it seems to me that (deep) RL will solve the cognition part of the equation.
Well done. It's great how you went into the deep background at the end there.
Thank you so much LF! Great job.