MIT 6.S091: Introduction to Deep Reinforcement Learning (Deep RL)

  • Published Nov 2, 2024

Comments • 144

  • @lexfridman
    @lexfridman  5 years ago +343

    Deep RL is my favorite subfield of AI, because it asks some fundamental questions about what it takes to build safe and intelligent robots that operate in the real world. So many open problems and interesting challenges to solve!

    • @thesk8erdav
      @thesk8erdav 5 years ago +3

      we love you Lex!

    • @farhadsafaei1910
      @farhadsafaei1910 5 years ago +3

      It's my favorite one, too. Thanks for the lecture, I enjoyed watching it a lot.

    • @colouredlaundry1165
      @colouredlaundry1165 5 years ago +3

      With these lectures and interviews you are sharing and creating immense value: knowledge. Thank you!

    • @dklvch
      @dklvch 5 years ago +1

      Thank you Lex, awesome presentation!

    • @liuculiu8366
      @liuculiu8366 5 years ago +1

      love your spirit in sharing the latest information. Appreciate it!

  • @NakedSageAstrology
    @NakedSageAstrology 2 years ago +53

    I wish you still did videos like this, we appreciate you sharing such knowledge.

  • @KeepingUp_withAI
    @KeepingUp_withAI 5 years ago +26

    Deep RL is the field that excites me the most. Thank you Lex.

  • @wendersonj
    @wendersonj 5 years ago +8

    Since 2017, Lex has improved his lessons spectacularly! Now (2019), I watch a more fluid video with the feeling that this guy knows exactly what he's talking about, without hesitating. Once again, thanks Lex, for sharing these videos. Congratulations and thanks from Brazil.

  • @kawingchan
    @kawingchan 5 years ago +6

    I really like that tongue-in-cheek chuckle when Lex talked about that multiverse and whoever created it.....

  • @Techieadi
    @Techieadi 5 years ago +68

    Thank you for bringing these lectures to us.

  • @nova2577
    @nova2577 5 years ago +32

    "Every type of machine learning is supervised learning", cannot agree more!!!

    •  4 years ago

      In fact, learning itself is a supervised process; otherwise it is acquiring, not learning.

  • @samuelschmidgall2090
    @samuelschmidgall2090 5 years ago +11

    Seriously the best Deep RL lecture out there to date.

  • @tarunpaparaju5382
    @tarunpaparaju5382 5 years ago +2

    I have tried to study and understand Deep RL using several books and lectures over the last few years, but I only feel like I understood something in RL after listening to this lecture. Thanks, Lex. I am grateful to you for posting this lecture on YouTube. Thank you!

  • @akarshrastogi3682
    @akarshrastogi3682 5 years ago +9

    1:04:40 Best part: that grin after he just casually dropped that line in an MIT lecture... all of the infinite universes being simulations.

  • @ronaldolum464
    @ronaldolum464 6 months ago

    Certainly, one of the best videos on deep learning I have come across.

  • @DennisZIyanChen
    @DennisZIyanChen 4 years ago +1

    I honestly don't care about AlphaGo or Dota 2 or the robots, I just cannot get over how incredible the thought structure behind this is. What I mean by thought structure is the strategy behind quantifying the right things, asking the right questions, and modeling the policy upon which growth can be created. IT IS SICK

  • @sivaa6130
    @sivaa6130 5 years ago +6

    Every lecture has historical context, evolution, mathematics and inspiration, a technical overview, and a network-architecture overview. Well summarized!!

  • @akarshrastogi3682
    @akarshrastogi3682 5 years ago +12

    Professor Lex, can we get the entirety of 6.S091 on MIT OCW? This is an incredibly interesting topic that I've been working on (Evolutionary Computing), and I am currently enrolled in a project with thorough knowledge of Deep RL as a requisite. This research field has very few online resources besides Stanford's CS 234 and Berkeley's CS 285.
    Your explanations are immensely helpful and intuitive. Humanity will present its gratitude if this whole course is made available! AGI and AI safety issues need more attention before they become the greatest immediate existential risk; your courses can help raise general AI awareness and advance our civilization to higher dimensions. Loved the fact that you grinned while just casually mentioning the Simulation Hypothesis..

  • @judedavis92
    @judedavis92 2 years ago +1

    Loved the lecture. Definitely recommend his podcast. Quality.

  • @vast634
    @vast634 4 years ago +3

    Important detail when trying to transfer from a simulation to the real world: make the simulation have many random variations in its behavior/mechanics during runtime (such as drag, gravity, friction, size of the agent, random perturbations, etc.). This forces the agent to generalize more and not over-optimize on the details of the sim, which makes it easier to transfer the agent's capabilities to a real-world environment. (See the sketch after this thread.)

    • @user-sc8ph2ds2m
      @user-sc8ph2ds2m 2 years ago

      gravity is fake buddy ;)

    • @vast634
      @vast634 2 years ago

      @@user-sc8ph2ds2m Take a brick, stand still, throw it straight up, then you can observe if gravity exists, or not. Very simple experiment to administer.

    • @user-sc8ph2ds2m
      @user-sc8ph2ds2m 2 years ago

      @@vast634 you will experience buoyancy 🤦
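
  A minimal Python sketch of the domain-randomization idea described in the comment above, assuming a hypothetical Gym-style simulator with configurable physics; `SimEnv` and its parameter names are illustrative, not a specific library API:

    import random

    class SimEnv:
        """Hypothetical simulator whose physics can be reconfigured per episode."""
        def __init__(self, gravity=9.81, friction=0.8, drag=0.1, agent_size=1.0):
            self.gravity = gravity
            self.friction = friction
            self.drag = drag
            self.agent_size = agent_size

        def reset(self):
            # Return a placeholder initial observation.
            return [0.0]

    def make_randomized_env():
        # Sample the dynamics from wide ranges so the policy cannot
        # over-optimize on one exact simulator configuration.
        return SimEnv(
            gravity=random.uniform(8.0, 12.0),
            friction=random.uniform(0.4, 1.2),
            drag=random.uniform(0.0, 0.3),
            agent_size=random.uniform(0.8, 1.2),
        )

    for episode in range(1000):
        env = make_randomized_env()   # fresh random dynamics every episode
        obs = env.reset()
        # ... run the usual RL update loop on `env` here ...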

  • @merebhayl5826
    @merebhayl5826 2 years ago

    I like how you quoted many theorems from Dostoevsky and also a few axioms from Nietzsche's texts

    • @merebhayl5826
      @merebhayl5826 2 years ago

      I had never seen Lex's lecture videos other than the philosophical podcasts. This is my first. And I just wrote the above comment as a joke without seeing the video, and three minutes in I found Socrates, Kant, Nietzsche... 😂😂 That's very Lex👌

  • @danielvelazquez4472
    @danielvelazquez4472 5 years ago +11

    Haha he says "that is super exciting", without being excited! He is a robot!
    Thanks for the open lectures

  • @MistaSmilesz
    @MistaSmilesz 5 years ago

    I've seen a lot of these videos & read some of the books in ML; Lex has a clarity that's rare.

  • @charlesotieno6309
    @charlesotieno6309 5 years ago

    Thanks Lex!! Deep Reinforcement Learning opens up a new world. Life is not that complex, like the baby in your video taking his first steps... unsupervised learning. Take into account the amount of time and effort (brains + USD) of getting an AI to do what the baby is doing: WALK in a few days, and in the years to come be a professor and continue with this subject.
    The baby is the moral of the story... what we are doing is not working... we need a radical way of thinking... Your radical way is the way forward.

  • @amandajrmoore3216
    @amandajrmoore3216 2 years ago

    As always, Lex, a generous share, which will be a useful resource for loads of folks. Thanks.

  • @mrr5183
    @mrr5183 2 years ago

    I appreciate the philosophical insights sprinkled throughout the lecture!

  • @chinbold
    @chinbold 5 years ago

    I like his lecture because it's more understandable. And I also like his tones.

  • @ArghyaChatterjeeJony
    @ArghyaChatterjeeJony 5 years ago +2

    Lex Fridman, I just love your videos. I am your great fan sir. Carry on.

  • @Arghamaz
    @Arghamaz 10 months ago

    This is interesting for me, as mathematics and statistics combined in algebraic equations is my favorite 🎉 MATHEMATICS is the best subject in the world 🌎 👌 ❤🎉🎉

  • @samferrer
    @samferrer 5 years ago

    Another detail I have noticed in many presentations ... those agents are not trying to model the environment ... that is semantically impossible ... what they are trying to do instead, I believe, is to model AN INSTANCE OF A DUAL SPACE associated with the environmental space. It is very common to use linear regressions, for instance ...

    • @samferrer
      @samferrer 5 years ago

      Kevvy Kim hmmm ... we are saying the same thing ... it seems that practitioners and lecturers keep it short without realizing, perhaps, the big conceptual gap that is being created.

  • @abdulrahmankerim2377
    @abdulrahmankerim2377 5 years ago +4

    One of the best lectures I have ever watched.... Keep it up.

  • @neutrinocoffee1151
    @neutrinocoffee1151 5 years ago +6

    Loved this lecture. I learned a lot. Thank you.

  • @ruinsaneornot
    @ruinsaneornot 5 years ago +27

    30:30 "you know, MIT does better than Stanford that kind of thing" xD

  • @benyaminewanganyahu
    @benyaminewanganyahu 11 months ago

    This guy should do podcasting.

  • @onwrdandupwrd5303
    @onwrdandupwrd5303 3 years ago

    that DeepRL animation looks like something out of Bamzooki

  • @sofina527
    @sofina527 9 months ago

    very helpful, thanks a lot dear prof.

  • @bayesianlee6447
    @bayesianlee6447 5 years ago +5

    Lex, I heard that DL professionals are now using simulations with nature-based environments to teach AI agents, e.g., making an agent learn how to walk or run by itself.
    Yoshua Bengio said the next evolution will be based on simulation environments for AI.
    Would you have any ideas or information to share on that?
    I really, really appreciate all your work and the spirit you have. Everyone in the world with an interest in AI really appreciates your work and sharing. Thank you! :)

    • @borispyakillya4777
      @borispyakillya4777 5 years ago

      Do you mean something like Gym-based simulations? MuJoCo is based on physical laws - you can already train with RL methods.

  • @kaneelsenevirathne7085
    @kaneelsenevirathne7085 3 years ago

    I took the engineering plasma class taught by your dad at Drexel :D

  • @msamogh96
    @msamogh96 4 years ago +2

    This guy is a better Siraj Raval.

  • @heinrichwonders8861
    @heinrichwonders8861 5 years ago +3

    I have been waiting for this.

  • @Lorkin32
    @Lorkin32 5 years ago +3

    Much better than the Stanford University lecture, where the lady basically only reads the equations without giving any real intuition about what's going on.

  • @LidoList
    @LidoList 4 years ago

    Very good explanation of RL, thanks to the speaker!

  • @hansharajsharma2765
    @hansharajsharma2765 4 years ago

    Love this. Thanks Lex.

  • @oldPrince22
    @oldPrince22 2 years ago

    very good lecture! Thanks.

  • @noname76787
    @noname76787 2 years ago

    thank you so much for the lecture!

  • @datta97
    @datta97 4 years ago

    Thanks for the last slide.

  • @Asmutiwari
    @Asmutiwari 4 years ago +2

    Amazing lecture on DRL. Can you also show us how we can implement the Q function in a neural network?
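
  A minimal sketch of one way to represent the Q function with a neural network, in the spirit of the DQN portion of the lecture; it assumes PyTorch, and the layer sizes and the `QNetwork` name are illustrative choices, not code from the course:

    import torch
    import torch.nn as nn

    class QNetwork(nn.Module):
        """Maps a state vector to one Q-value per discrete action."""
        def __init__(self, state_dim, num_actions, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, hidden),
                nn.ReLU(),
                nn.Linear(hidden, num_actions),  # Q(s, a) for every action a
            )

        def forward(self, state):
            return self.net(state)

    q = QNetwork(state_dim=4, num_actions=2)       # e.g. a CartPole-sized problem
    state = torch.randn(1, 4)                      # one batched state
    q_values = q(state)                            # shape: (1, 2)
    greedy_action = q_values.argmax(dim=1).item()  # epsilon-greedy would sometimes explore instead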

  • @AbhishekKumar-mq1tt
    @AbhishekKumar-mq1tt 5 years ago +1

    Thank u for this awesome video

  • @mrektor
    @mrektor 5 years ago +1

    Amazing work. Excellent lecture.

  • @emilecureau
    @emilecureau 2 years ago

    "when the reward flips, the optimal path is grad school, taking as long as possible and never reaching the destination....pffff" lol 21:20

  • @CarlosGutierrez-go9hq
    @CarlosGutierrez-go9hq 1 year ago +1

    Since I began my journey in data science, machine learning, and AI, I have been seeing patterns. Am I the only one who sees that we are probably just programs seeking a never-ending end of this simulation? The way Q-learning is designed is the most realistic comparison to human thought, so in order to maximize my output, do I have to reconsider my reward mechanism? (taking some info from Huberman also)

  • @junxu147
    @junxu147 2 years ago

    Great lecture!

  • @jonk.3947
    @jonk.3947 5 years ago

    Love the Digital Physics reference at 1:04:00 :)

  • @alec1975
    @alec1975 2 years ago

    very good intro

  • @stmandl
    @stmandl 5 years ago +2

    Hi Lex, thanks for this great lecture! Which books of Nietzsche did you have on your mind around 4:33?

  • @Lunsterful
    @Lunsterful 5 years ago +1

    Excellent talk.

  • @yu-siangwang1818
    @yu-siangwang1818 5 years ago

    Great overview of DRL

  • @konouzkartoumeh
    @konouzkartoumeh 5 years ago

    Great lecture! Thank you.

  • @AviaEfrat
    @AviaEfrat 4 years ago +1

    27:24 - There is no "reload" in Doom =)

  • @eeee8677
    @eeee8677 5 years ago

    THANK YOU MIT

  • @stabgan
    @stabgan 5 years ago

    You are my idol lex

  • @jasonabc
    @jasonabc 5 years ago

    Really great lecture, learned a lot.

  • @kaiwang2924
    @kaiwang2924 5 years ago

    Wonderful lecture.

  • @sarathrnair9499
    @sarathrnair9499 5 years ago +1

    Why is no one asking any questions? Or were those portions edited out? Nice lecture.

  • @johnmacleod7789
    @johnmacleod7789 5 years ago

    Brilliant!!

  • @OldGamerNoob
    @OldGamerNoob 5 years ago +1

    My naive perception is that every frame of "video" entering each of our eyes, and every second of sensory data we receive from birth, constitutes a rather large data set for our brains to train on (while also being able to constantly train and update the network).

    • @mutyaluamballa
      @mutyaluamballa 5 years ago

      Yes, but my perception is that the brain is already a model trained on data from all our ancestors. At the time of birth we have a trained model with all the necessary weights, excluding the dataset it was trained on (our ancestors' lives), which can be retrained on the go based on our experiences. : )

    • @kawingchan
      @kawingchan 5 years ago

      I think this may be mostly true for other mammals: the less intelligent, the more hard-wired. When it comes to humans, I'm not so sure how much we rely on genetic wiring vs. neural plasticity, a.k.a. training. Not sure if any ethical experiments can bring any insight.

  • @caizifeng
    @caizifeng 4 years ago

    great lecture

  • @thepalad1n197
    @thepalad1n197 4 years ago +3

    oh shit i listen to your podcast lmao

  • @putzz67767
    @putzz67767 5 years ago

    very good!!

  • @benaliamima9903
    @benaliamima9903 3 years ago

    Thank you for this amazing video. I want to know if I can use DRL principles to enhance QoS requirements in vehicular networks??
    Any suggestions??

  • @OEFarredondo
    @OEFarredondo 5 years ago +2

    Remove the human factor. Have the traffic be free of human crossing

  • @inaamilahi5007
    @inaamilahi5007 3 years ago

    Awesome

  • @reinerwilhelms-tricarico344
    @reinerwilhelms-tricarico344 1 year ago

    Couldn't always follow. Was distracted by the two cats and then later by the fool who fell in the water. 🙂

  • @Lorkin32
    @Lorkin32 5 years ago +1

    How/why can you even upload this for free? Doesn't university cost loads in the US?
    Great stuff though!

    • @m3awna
      @m3awna 5 years ago

      I guess that's because MIT is focusing more on workshops/hands-on learning, AND to raise the bar for other universities/institutes... hhh

    • @petevenuti7355
      @petevenuti7355 2 years ago

      But if a diploma is your goal, it's sometimes helpful to sit in on a class before you take it for credit; it can make it easier, but sometimes it just makes it boring and counterproductive the second time around.

    • @petevenuti7355
      @petevenuti7355 2 years ago

      Sitting in doesn't get you credits or a diploma.

  • @bryanbocao4906
    @bryanbocao4906 5 years ago +1

    It would be appreciated if anyone could lay out, in great detail, the specific steps to get all the directions on the map from 18:51 to 21:32.
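
  A minimal value-iteration sketch for a gridworld like the one discussed around 18:51-21:32; the layout, rewards, and discount below are illustrative assumptions rather than the exact numbers from the slide. Each cell's arrow is the action whose one-step lookahead value is highest:

    # Deterministic gridworld: +1 goal, -1 pit, small step cost (assumed values).
    ROWS, COLS = 3, 4
    TERMINALS = {(0, 3): 1.0, (1, 3): -1.0}
    WALLS = {(1, 1)}
    STEP_REWARD, GAMMA = -0.04, 0.99
    ACTIONS = {'^': (-1, 0), 'v': (1, 0), '<': (0, -1), '>': (0, 1)}

    def step(s, a):
        r, c = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
        nxt = (r, c)
        if not (0 <= r < ROWS and 0 <= c < COLS) or nxt in WALLS:
            nxt = s                    # bumping into a wall or the edge: stay put
        return nxt

    V = {(r, c): 0.0 for r in range(ROWS) for c in range(COLS)}
    for _ in range(100):               # value-iteration sweeps until (near) convergence
        for s in V:
            if s in TERMINALS or s in WALLS:
                V[s] = TERMINALS.get(s, 0.0)
                continue
            V[s] = max(STEP_REWARD + GAMMA * V[step(s, a)] for a in ACTIONS)

    # Greedy policy: in each cell, point toward the neighbor with the highest value.
    policy = {s: max(ACTIONS, key=lambda a: V[step(s, a)])
              for s in V if s not in TERMINALS and s not in WALLS}
    for r in range(ROWS):
        print(' '.join(policy.get((r, c), '.') for c in range(COLS)))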

  • @Twgvlogs539
    @Twgvlogs539 5 years ago +1

    Super

  • @sauravsingh9177
    @sauravsingh9177 2 years ago

    Check out "Spinning Up in Deep RL" by OpenAI.

  • @nisman.lo.desvivieron
    @nisman.lo.desvivieron 1 year ago

    27:07 Lex is scared of Doom

  • @deeplearningpartnership
    @deeplearningpartnership 5 years ago +2

    Nice.

  • @aabkhcdcz6067
    @aabkhcdcz6067 5 years ago

    Thank you very much

  • @msp9331
    @msp9331 3 years ago

    Isn't that the guy from Joe Rogan's podcast? It takes me a week to grasp what he says in 5 minutes.

  • @samferrer
    @samferrer 5 years ago

    I am having a hard ... very hard time believing that the brain uses backpropagation as a learning mechanism ... it just makes no sense in a space-time-governed universe ... god damn good lecture, by the way ...

    • @MrPeregrineFalcon
      @MrPeregrineFalcon 4 years ago

      Lex doesn't say the brain uses it (he says it's a mystery). And more generally most cognitive neurologists don't believe it does - although some think there are similar biological correlates. But it's a very efficient algorithm for ANNs to perform gradient descent.

    • @petevenuti7355
      @petevenuti7355 2 years ago +2

      As far as I know, biological brains don't use backpropagation. But there are neural circuits where the flow of information goes in the opposite direction. There is also the chemical side of things, integrating many levels of homeostasis, from hunger to pain to emotion.
      I would say the combination of those two is the mysterious correlate of backpropagation, backpropagation being the obviously oversimplified version.

    • @samferrer
      @samferrer 2 years ago

      @@petevenuti7355 got it ...

  • @rorylennon
    @rorylennon 2 years ago

    Nice video...

  • @scorpion7434
    @scorpion7434 5 years ago +1

    The funniest part is where he was trying to explain the ability of human brains by evolution at 6:33! And he literally said "it is somehow being encoded," which contradicts the rewards concept he is introducing!
    Son, the most logical reason for having a predefined encoding scheme that has never been trained is the existence of a creator!

  • @vincentschmitt392
    @vincentschmitt392 3 years ago

    nice tie

  • @ProfessionalTycoons
    @ProfessionalTycoons 5 years ago

    very good

  • @kevinayers7144
    @kevinayers7144 3 years ago

    Is the entire deep RL course available?

  • @ns4235
    @ns4235 3 years ago

    just create a large number of random simulations. if you're successful in a large number of other realities then this one should be easy. o_o

  • @el_lahw__el_khafi
    @el_lahw__el_khafi 11 days ago

    where are the rest of the lectures?

  • @pittyconor2489
    @pittyconor2489 3 years ago

    nice

  • @abhaysap
    @abhaysap 5 years ago

    Can we take ideas or clues from biomimicry architecture in reinforcement learning?

  • @jeanjacqueslundi3502
    @jeanjacqueslundi3502 3 years ago

    Are we really morally equipped to build AI that is safe, and to build it for the right reasons?
    This is my problem with contemporary science/technology... We don't focus on whether we SHOULD do something. Just because it's doable doesn't mean it should be made.

  • @liberator328
    @liberator328 5 years ago +1

    Which Nietzsche book is he recommending at 4:12 ?

  • @fizzfox8886
    @fizzfox8886 3 years ago

    the robots won't be happy to see that we kicked them in our labs instead of being friendly :/

  • @samlaf92
    @samlaf92 5 years ago

    @50:06 DQN can't learn stochastic policies. DQN has a softmax output on actions... isn't that a stochastic policy in itself?

  • @abhirishi6200
    @abhirishi6200 4 years ago

    Nice yo

  • @rikelmens
    @rikelmens 5 years ago

    Lex is super low on cortisol and super high on GABA. So much so that he sounds quite sleepy sometimes.

  • @guilhermeparreiras8467
    @guilhermeparreiras8467 5 years ago +3

    Could bet he is a fan of Jordan Peterson.

    • @ryanvb3452
      @ryanvb3452 5 years ago

      What makes you think so?

  • @devonk298
    @devonk298 4 years ago

    I like turtles!

  • @Mark-vv8by
    @Mark-vv8by 4 years ago

    The viewers are a lot fewer than for the first clip.

  • @arsh2489
    @arsh2489 8 months ago

    2:15

  • @midishh
    @midishh 2 months ago

    hugest*

  • @skyfeelan
    @skyfeelan 2 years ago

    34:12

  • @spinLOL533
    @spinLOL533 5 years ago

    Insert comment