Reinforcement Learning: Machine Learning Meets Control Theory

  • Published on Feb 2, 2025

Comments • 262

  • @s3pi0n
    @s3pi0n 2 years ago +10

    This is THE BEST explanation of reinforcement learning out of all the articles, books, or YouTube videos that I've seen so far. Period.

  • @ronniec8805
    @ronniec8805 4 years ago +446

    Steve is a phenomenal lecturer, isn't he?

    • @cnbrksnr
      @cnbrksnr 4 years ago +6

      never seen a better one

    • @uforskammet
      @uforskammet 4 years ago

      very much so

    • @flybekvc
      @flybekvc 4 years ago

      He is!

    • @mail2cmin
      @mail2cmin 4 years ago

      Yessss

    • @reocam8918
      @reocam8918 4 years ago

      no, he is the most phenomenal one!! Respect

  • @Globbo_The_Glob
    @Globbo_The_Glob 4 years ago +88

    Just wanted to comment about how much I love these videos. Last year while applying for PhDs I was searching for passions. In a discussion with my friend (a computer scientist), I accidentally outlined genetic programming without knowing it. My friend told me so and I went researching. Found these videos and became enthralled. Now I have a PhD studentship in soft robotics and plan to use SINDy to help with modelling and control and honestly think that giving machines brains may be my future work too. Thanks Brunton, my passion was helped by your own.

    • @Eigensteve
      @Eigensteve  4 years ago +34

      That is amazing to hear! Helping people develop their passions is exactly why I do this!

  • @baseljamal8907
    @baseljamal8907 3 years ago +15

    I just cannot express how grateful I am to prof Steve Brunton for posting these videos. Waking up at 6am to watch him explain is the most satisfying thing ever. Thank you! We all are grateful.

  • @steven-bt7ud
    @steven-bt7ud 4 years ago +46

    I wish I had known about this channel at the start of quarantine

    • @hysenndregjoni853
      @hysenndregjoni853 3 years ago

      I found out about the channel just as quarantine started. It was quite the treat.

  • @subhikshaniyer613
    @subhikshaniyer613 2 years ago +1

    Every time he said "good", I felt appreciated for not giving up on a lecture whose subject is far, far away from mine, and I'm pushing myself to try and learn the concept. Thank you, Steve, much love!

  • @ethanspastlivestreams
    @ethanspastlivestreams 4 years ago +37

    Viewing reinforcement learning as time delayed supervised learning is a really good way of looking at it.

    • @JousefM
      @JousefM 4 years ago +3

      Indeed!

  • @motbus3
    @motbus3 4 hours ago

    I've studied RL before, but I wanted to refresh what I know, and this is by far the best policy/value explanation I've ever had.
    Thank you, professor.

  • @Spiegeldondi
    @Spiegeldondi 3 years ago +5

    I love how you emphasize the intersection between machine learning and control (theory). That's exactly what sparks my interest in reinforcement learning!

    • @Eigensteve
      @Eigensteve  3 years ago +1

      Glad you like it! I always found this connection fascinating and a very natural way to merge the two fields.

  • @elultimopujilense
    @elultimopujilense 3 years ago +4

    Is there anything you don't know, dude? You seem to be an expert on everything. You are such an inspiration.

  • @teegnas
    @teegnas 3 years ago +3

    As a CS grad student who took RL in the last semester ... this is truly the best refresher I have seen until now. Thanks a lot for uploading.

  • @sankalp1391
    @sankalp1391 4 years ago +42

    Would love a full series on how we can use RL to control real-world dynamical systems!

  • @christiankraghjespersen994
    @christiankraghjespersen994 4 years ago +65

    I still have no idea as to who could possibly dislike these videos

    • @phaZZi6461
      @phaZZi6461 4 years ago

      u

    • @devashishbose1521
      @devashishbose1521 4 years ago

      @@phaZZi6461 I wanted to add a comment but 69 looks so good

    • @Arocarus
      @Arocarus 5 months ago

      It could have been someone who only believes in deterministic models.

  • @thelazygardener9493
    @thelazygardener9493 4 years ago +1

    I've been seriously considering starting a degree in A.I./Machine learning but with videos of this quality available for free, it is hard to justify the cost. Subscribed and liked!

    • @thelazygardener9493
      @thelazygardener9493 4 years ago +1

      Just in case you read this and have time to reply... Do you have any suggestions for an education path to your level of understanding? There are degrees for data science, computer science, artificial intelligence, software engineering, etc. They all seem so inter-related. I want to know them all, but I'm struggling to pick a starting point.
      My current level of related education is high-school-level advanced maths and a year of teaching myself MQL4/5 and R code, mostly from free resources online. Just so you know my starting point (or state, haha).

  • @Optinix-gz1qg
    @Optinix-gz1qg 4 years ago +6

    Never clicked on a video that fast 😆. Great content as always, prof, love it!

  • @spencerhong4687
    @spencerhong4687 3 years ago

    Mr. Brunton saved me during my final review. His lectures made those seemingly unfathomable terms crystal clear. I've just watched his videos for days and I already like him!

    • @spencerhong4687
      @spencerhong4687 3 years ago

      Those bipedals are too cute, they deserve another comment

  • @souravjha2146
    @souravjha2146 3 years ago +2

    I have been binge-watching this channel for the past 3 hours

  • @thiagocesarlousadamarsola3990
    @thiagocesarlousadamarsola3990 3 years ago +2

    This sweet spot between control theory and machine learning definitely interests me, especially applied to astrodynamical systems. Please, continue making these videos, Professor Brunton!

  • @sistemasecontroles
    @sistemasecontroles 4 years ago +8

    Great channel! Please record more videos at the interface of reinforcement learning and control theory. Congrats on your work.

  • @whasuklee
    @whasuklee 4 years ago +56

    *"WELCOME BACK"*

  • @msauditech
    @msauditech 1 year ago +1

    That's an awesome video indeed. A great introduction to RL!

  • @cuongnguyentranhuu4616
    @cuongnguyentranhuu4616 3 years ago +1

    You have created such high-quality content that I just really enjoy watching it instead of playing games :)))

  • @rich_girl_bookclub
    @rich_girl_bookclub 3 months ago

    Need more content like this! Thank you for this video, it made the content incredibly digestible.

  • @mbengiepeter965
    @mbengiepeter965 3 months ago

    This is an excellent lecture. It seems to me that each time an agent makes an observation (s), it computes the best action (a) to take, executes the action, and then receives the next state (s'). It is the agent that decides whether this next state or configuration of the environment is a reward (r) or not. Take the example of a scientist in contact with, or interacting with, the environment. It is the scientist/observer who decides whether s' is a reward or not.
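
    A minimal sketch in Python of the interaction loop described above, assuming a hypothetical environment object with a reset/step interface and a placeholder random policy; note that in the standard formulation the reward r is returned by the environment together with s', rather than judged by the agent:

    # Sketch of one episode of the agent-environment loop (env and actions are
    # hypothetical placeholders, not an actual library API).
    import random

    def choose_action(state, actions):
        # Placeholder policy: pick a random action from the available set.
        return random.choice(actions)

    def run_episode(env, actions, max_steps=100):
        s = env.reset()                    # initial state/observation s
        total_reward = 0.0
        for _ in range(max_steps):
            a = choose_action(s, actions)  # agent selects action a in state s
            s_next, r, done = env.step(a)  # environment returns s' and reward r
            total_reward += r
            s = s_next                     # s' becomes the current state
            if done:
                break
        return total_reward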

  • @tai94bn
    @tai94bn 2 years ago +1

    It's really interesting to watch this video. Although I have also studied and read this material a few times, its boredom was hard to describe.
    Thank you, teacher.

  • @carlphilip4393
    @carlphilip4393 3 years ago +1

    Dear Steve,
    I'm very, very grateful that I get to watch such extraordinarily instructive videos for free!!! Thinking that elsewhere in the world people are killing others atm (as in Kabul), it gives me a lot of hope seeing how people like you just make the world a little better, and it almost brings tears to my eyes. You have such great talent in teaching, thank you!

  • @wizardOfRobots
    @wizardOfRobots 4 years ago +12

    Wow! I would love to see Prof take on RL topics!

  • @AW_tuber
    @AW_tuber 3 years ago

    The lecture was very well constructed. Well done! As an electrical engineering student trying to specialize in ML, I find that you really hit the mark when it comes to putting these tough and convoluted topics together with examples.

  • @fzigunov
    @fzigunov 4 years ago +1

    Looks like I'm not the only one working on a video early in the morning! Really cool stuff, love the doggie!!

  • @pellythirteen5654
    @pellythirteen5654 3 years ago

    Your series is excellent. It has a good pace and uses powerful graphics to explain difficult concepts.
    I've watched many of your videos on my TV, which doesn't allow me to give a thumbs up. So here it is.
    I am not a Python programmer, but I am sure that those watching who DO use Python must have itchy fingers.

  • @nahidmahmud8234
    @nahidmahmud8234 4 years ago

    I am doing research on the Model-based RL for safety-critical systems; I really enjoy doing it. These are so cool. Thanks for making videos on this topic!

  • @hudhuduot
    @hudhuduot 3 years ago +2

    Steve is one of the gifted teachers. I wish you could guide postgraduates toward good publications in control and learning by highlighting hot topics and promising research directions.

  • @sachinr3823
    @sachinr3823 4 years ago

    I've been waiting for this topic for a long time; your lectures are so clear. Thanks a lot.

  • @loopuleasa
    @loopuleasa 4 years ago +1

    Top quality.
    This is what they said about education on the internet: "the best teacher can teach everyone."
    This is that video for this topic.

  • @ghostofhacker2818
    @ghostofhacker2818 4 years ago

    I just found your channel, and the content you cover is a treasure to me!

  • @terryliu3635
    @terryliu3635 10 months ago

    Awesome lecture! Thanks Steve. I really enjoyed watching this!

  • @merv893
    @merv893 2 years ago

    How very mean, I was looking forward to seeing trial 7 right away. Great explanation. Thanks

  • @diegoguisasola3858
    @diegoguisasola3858 4 years ago

    I really love your content, please keep spoiling us!
    These were the fastest 26 minutes! I learnt a lot and I'm looking forward to the python lab implementations of these concepts! Thank you very much for your work.

  • @phafid
    @phafid 2 years ago

    What I like is that I don't pay for this knowledge. I was planning to take a data science certificate, but you know what, let me spend 6 months learning by myself. I have spent a solid month just on your videos, starting from SVD, and it has been amazing. I love when a small thing builds up into a bigger thing. Soon I will make a sample project based on what I have learned from your videos.

  • @kouider76
    @kouider76 4 years ago +1

    Simply a great subject and an excellent presentation. Thank you, prof, for all your efforts.

  • @RasitEvduzen
    @RasitEvduzen 3 years ago

    Professor, you're awesome. My thesis topic is deep-reinforcement-learning-based robotic arm torque control. I love control theory and machine learning. Thanks for your support.

  • @JousefM
    @JousefM 4 years ago +2

    Theeeere we go Steve! Waited for this :)

  • @theclassoftorchia3856
    @theclassoftorchia3856 2 years ago

    Hi, Steve. I've been working on Fluid Mechanics for 25 years or so, always using experimental and some analytical tools to approach the subject. I had a lot of colleagues migrating to CFD back in the 2000s because these methods seemed to find valid results with "little" effort in comparison to expensive, frustrating and time-consuming experiments. So I always disregarded CFD as a nice tool that could predict a lot of stuff that you will never know is correct or not.
    However, I have to say that for some time now, reinforced (see what I did there?) by new material that I am studying and your papers on ML for Fluid Mechanics, I have been looking at the subject with new eyes. Thank you very much for your material and the dedication you put into every video.

  • @givemeArupee
    @givemeArupee 1 year ago

    Steven's lectures are a great help to society ❤

  • @alistja4337
    @alistja4337 4 years ago

    Explained in an understandable way and RL nicely connected to control theory!

  • @focusonlife3242
    @focusonlife3242 3 years ago

    Dude, you are the best lecturer. DONE

  • @JoeM370
    @JoeM370 1 year ago

    The essence of this content is profoundly influential. A book with akin messages was transformative. "Game Theory and the Pursuit of Algorithmic Fairness" by Jack Frostwell

  • @tabindahayat3492
    @tabindahayat3492 3 years ago

    I love u, Steve! I am currently working on Machine Teaching and Project Bonsai. I really needed to know this.

  • @pierredubois8715
    @pierredubois8715 4 years ago +1

    Thank you so much for this lecture. I really enjoy your videos; this is helpful as a PhD student. I also bought your book "Data-Driven Science and Engineering", which has nice explanations for the tools I use. Keep up this awesome work! Greetings from France!

  • @mertbozkir
    @mertbozkir 3 years ago

    Perfect video, I will watch all the others in one go 😍

  • @Voke
    @Voke 4 years ago

    Great video! If everyone on YouTube was as great as your delivery, we would have a lot more passion in the area. Keep up the good work, train on!

  • @dr.-ing.shehzadhasan3387
    @dr.-ing.shehzadhasan3387 4 years ago +2

    You have a nice way of explaining the topics.

  • @subramaniannk3364
    @subramaniannk3364 4 years ago +2

    Yay! Hero has decided to teach Reinforcement Learning

  • @SRIMANTASANTRA
    @SRIMANTASANTRA 4 years ago +3

    Hi Professor Steve, Lovely presentation.

  • @tytuer
    @tytuer 3 years ago

    After watching these videos I have actually understood the concept of reinforcement learning. I might be wrong, but to me it seems it generalizes feedback loops into the more abstract concepts of agent, action, policy, environment, etc. In a feedback loop we have a control policy, which is a PID controller that controls the behaviour of the plant it is attached to. The model of the plant is the environment here, and the action is the output of the PID controller. The reward in a feedback loop is to converge to the desired output value at steady state while ignoring its transient values, so it is in a sense semi-supervised learning. The states in the feedback loop are the derivative components of the system. In noisy systems it is sometimes crucial to remove the derivative component to avoid impulsive behaviour, which corresponds to the state feedback from environment to agent in RL. Thinking about it like this, RL is more meaningful to me as an engineer: RL is a generalized feedback system where we try to get a desired output given some input to the system. Thank you for this video series!!
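
    The analogy in the comment above can be made concrete: a PID controller is a deterministic "policy" that maps an observed error state to a control action. A minimal sketch, assuming a made-up first-order plant and hand-picked gains (nothing here is from the lecture):

    # A PID controller viewed as a deterministic policy u = pi(state), where the
    # "state" the policy sees is the tracking error plus its integral and derivative.
    class PIDPolicy:
        def __init__(self, kp, ki, kd, dt):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.integral = 0.0
            self.prev_error = 0.0

        def action(self, error):
            self.integral += error * self.dt
            derivative = (error - self.prev_error) / self.dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    # Hypothetical plant x' = -x + u (the "environment"), driven toward a setpoint;
    # the analogue of reward is a small steady-state tracking error.
    policy = PIDPolicy(kp=2.0, ki=0.5, kd=0.1, dt=0.01)
    x, setpoint = 0.0, 1.0
    for _ in range(2000):
        u = policy.action(setpoint - x)   # policy maps state (error) to action u
        x += (-x + u) * 0.01              # plant/environment update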

  • @ramanikrishnamurthy7086
    @ramanikrishnamurthy7086 4 years ago

    I thought I was witnessing a breakthrough concept trying to link deterministic control theory with machine learning, but when you mentioned the words probability and policy, I was disappointed. Looking forward to more conceptual lectures. You could also highlight real-world applications. Thanks.

  • @CamillaSantAnnadaSilva
    @CamillaSantAnnadaSilva 3 months ago

    best video series in existence on this planet

  • @car0lm1k3
    @car0lm1k3 4 years ago

    I have been trying to teach my guys that machine learning and control theory (fuzzy autotuning) are the same principle. This video will be used!

  • @Kong9901
    @Kong9901 4 years ago +2

    That's so interesting and well explained. Thank you!

    • @Eigensteve
      @Eigensteve  4 years ago

      Glad you liked it!

  • @francesco884
    @francesco884 3 years ago

    Thank you, professor Steve Brunton. I am pleased to inform you that I am considering doing, after my master's degree in computer engineering, a PhD related to data-driven control theory, and part of the merit is also yours.

  • @sounghwanhwang5422
    @sounghwanhwang5422 3 years ago

    The most fantastic lecture that I've ever seen...

  • @timanb2491
    @timanb2491 3 years ago

    It's brilliant! Keep working on this topic, please.

  • @aniruddhadatta925
    @aniruddhadatta925 4 years ago

    Amazing feeling to watch a video after completing a project on the same topic

  • @shehrozeshahzad4363
    @shehrozeshahzad4363 a few months ago +1

    Steve, thanks a lot, very comprehensive lectures. Can you also share the notes and assignments for this one?

  • @chrisogonas
    @chrisogonas 2 years ago

    Very well illustrated! Thanks

  • @leopardus4712
    @leopardus4712 4 years ago +1

    Keep up the good work, love your videos

  • @minglee5164
    @minglee5164 2 years ago

    RL can be interpreted from this perspective, amazing

  • @melvinlara6151
    @melvinlara6151 4 years ago +1

    I was waiting for this!!!

  • @lazyoneswapples2962
    @lazyoneswapples2962 1 year ago

    A very well done lecture. Bravo!
    I'd like to make a suggestion, if I may, to modify the policy function as
    pi(s,a) = Pr(A = a, S = s); A is the placeholder for an action, and a is the action taken; S is the placeholder for the state, and s is the given state.
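
    For reference, the standard textbook convention (e.g., Sutton and Barto) writes the policy as a conditional probability of the action given the state, not a joint probability over both; in that notation:

    \pi(a \mid s) = \Pr\left( A_t = a \mid S_t = s \right)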

  • @TheProblembaer2
    @TheProblembaer2 1 year ago

    This is really really great teaching.

  • @arrahul316
    @arrahul316 2 years ago

    Amazing Clarity

  • @Bineshmht12
    @Bineshmht12 a few months ago

    wtf, I didn't find this channel at first... there's no other channel like this, ever...

  • @curumo_curunir
    @curumo_curunir 3 years ago

    Thank you very much for the video. This is priceless. Thank you very much for your efforts and for sharing high-quality information.
    I created some timestamps for myself, but if they help someone, I would be happy.
    15:37 Optimization
    21:49 Q-learning

  • @TheAIEpiphany
    @TheAIEpiphany 4 years ago +1

    Hey Steve! Loved your lecture! Could you tell me what your setup is? I love your production, setup, and content of course!
    Some questions:
    1. Do you have a screen/script in front of you and a green screen behind?
    2. Which cam and mic do you use? Is it only a lav mic? I assume it's not shotgun since you're far away from any particular point of the frame.
    3. How much time does it take to create a video like this one?
    4. How many dry runs do you usually do? Or for this video in particular?
    You're setting a new standard for production (and beyond haha), keep up the good work!
    I'd really appreciate your answers, thank you in advance!

    • @Eigensteve
      @Eigensteve  3 years ago

      Thanks, glad you like it! No script, but I have a screen so I can see where I am relative to the presentation. I use a lav mic and a Canon 4K camera. I usually do everything in one run; sometimes I redo the intro a couple of times until I'm happy with it.

    • @TheAIEpiphany
      @TheAIEpiphany 3 years ago

      @@Eigensteve thanks Steve!

  • @radhen171992
    @radhen171992 4 years ago

    I really like your videos. Keep up the good work! :)

  • @marco_gallone
    @marco_gallone 4 years ago +1

    I've been following your content for at least 4 years now! It's the reason I am a robotics control engineer now; you pulled me through 4th-year control systems with your conveniently timed boot camp. Please keep up the great content!
    PS: are you accepting PhD students?

  • @riccardodelpozzo8683
    @riccardodelpozzo8683 2 years ago

    phenomenal video, thank you

  • @pardonchawatama
    @pardonchawatama 3 years ago +1

    Great lesson.. Thank you

  • @carriefu458
    @carriefu458 3 years ago

    Prof Brunton: You are one bad-ass teacher!!!🤓

  • @Turcian
    @Turcian 4 years ago +1

    I think it's also important to mention the distinction between discrete and continuous action spaces.

  • @fahimehjabbarinia401
    @fahimehjabbarinia401 2 years ago

    The best one I have ever seen

  • @HD-qq3bn
    @HD-qq3bn 4 years ago

    We also look forward to your explanation of GANs in the future

  • @aakashdewangan7313
    @aakashdewangan7313 6 months ago +1

    A video on ADRC, please (active disturbance rejection control), with an implementation in Simulink.
    The videos on this topic by others are not good.

  • @Lucas_Lima606
    @Lucas_Lima606 2 years ago

    thank you very much for your lesson, it is really useful to me!

  • @TheRestalyn
    @TheRestalyn 1 year ago

    love your lectures

  • @hhhhhhhhhhhhha
    @hhhhhhhhhhhhha 8 months ago

    Great Lecture

  • @Firestorm-tq7fy
    @Firestorm-tq7fy 3 years ago

    I like this video, but there is already a very common way to address the density problem. It's called actor-critic, where you basically get a second network to learn what reward it is expecting (Q-learning), while the actor is a policy-gradient network.
    It works fine so far, and I know it's not enough to get away from the semi-supervised setting, but let's be honest, the "semi" is what really defines the technique, because the agent needs to learn by itself what "could" be good in the future, without a supervisor. That's how animals and humans learn too, so a fully supervised agent wouldn't be exploring the world on its own anymore.
    Greetings, Firestorm
    (ETH Zurich student)
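
    A minimal tabular sketch of the actor-critic structure described in the comment above, on a made-up 3-state chain problem (no neural networks, not from the lecture): the critic learns V(s) with TD(0), and the actor is a softmax policy updated by a policy-gradient step weighted by the TD error.

    # Tabular one-step actor-critic on a toy chain MDP (hypothetical toy problem).
    import numpy as np

    n_states, n_actions = 3, 2                # actions: 0 = left, 1 = right
    prefs = np.zeros((n_states, n_actions))   # actor parameters (action preferences)
    V = np.zeros(n_states)                    # critic estimate of the value function
    gamma, alpha_actor, alpha_critic = 0.95, 0.1, 0.2
    rng = np.random.default_rng(0)

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    def step(s, a):
        # Chain dynamics: right (+1) / left (-1); reward 1 for reaching the last state.
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        done = (s_next == n_states - 1)
        return s_next, (1.0 if done else 0.0), done

    for episode in range(500):
        s, done = 0, False
        while not done:
            pi = softmax(prefs[s])
            a = rng.choice(n_actions, p=pi)
            s_next, r, done = step(s, a)
            # Critic: one-step TD error, bootstrapping from V(s') unless terminal.
            target = r + (0.0 if done else gamma * V[s_next])
            delta = target - V[s]
            V[s] += alpha_critic * delta
            # Actor: policy-gradient step using the TD error as the advantage.
            grad_log_pi = -pi
            grad_log_pi[a] += 1.0
            prefs[s] += alpha_actor * delta * grad_log_pi
            s = s_next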

  • @aiahmed608
    @aiahmed608 3 years ago

    Thank you, professor!

  • @SiriGadipudi
    @SiriGadipudi 2 years ago +2

    All of your lecture series are very good and very helpful. A series on convex optimization problems would be good. Any thoughts about it?

  • @praharaj2007
    @praharaj2007 4 years ago +2

    Thanks Steve.

    • @Eigensteve
      @Eigensteve  4 years ago

      You are very welcome!

  • @MarinaOvchinnikov
    @MarinaOvchinnikov 1 year ago

    16:30 - wonder if it's differential or differentiable programming? Great video.

  • @Physicsandmathswithpraveen
    @Physicsandmathswithpraveen 4 years ago +1

    It would be an honor to be supervised for a PhD by him.

  • @alexanderschiendorfer2203
    @alexanderschiendorfer2203 4 years ago +2

    At 16:26, did you by any chance mean "dynamic programming" (value iteration, Q-value iteration, etc.) instead of "differential programming"? I couldn't make sense of the combination of TD, MC and DP otherwise.

    • @alexanderschiendorfer2203
      @alexanderschiendorfer2203 4 years ago +3

      Also, it would be awesome if you could elaborate on a comparison of control theory and reinforcement learning. When to use CT, when to use RL, etc.

    • @Eigensteve
      @Eigensteve  4 years ago +1

      Good catch, thanks!
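
    To make the trio from the question above concrete, here is a small sketch contrasting the two sampling-based value updates, Monte Carlo and TD(0), on a single made-up episode; dynamic programming would instead sweep over all states using a known model of the environment.

    # Monte Carlo vs. TD(0) value estimation for a fixed policy (toy data made up).
    import numpy as np

    gamma, alpha = 0.9, 0.1
    V_mc = np.zeros(3)
    V_td = np.zeros(3)

    # One hypothetical episode: (state, reward) pairs ending with a terminal transition.
    episode = [(0, 0.0), (1, 0.0), (2, 1.0)]   # reward received on leaving each state

    # Monte Carlo: wait for the full return G, then update every visited state.
    G = 0.0
    for s, r in reversed(episode):
        G = r + gamma * G
        V_mc[s] += alpha * (G - V_mc[s])

    # TD(0): update after every step, bootstrapping from the current estimate of V.
    for i, (s, r) in enumerate(episode):
        v_next = V_td[episode[i + 1][0]] if i + 1 < len(episode) else 0.0
        V_td[s] += alpha * (r + gamma * v_next - V_td[s])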

  • @forkshire2868
    @forkshire2868 10 days ago

    I had a doubt: won't giving dense rewards yield the same results as a supervised learning model, since we are getting pretty frequent answers?

  • @matejsuty5024
    @matejsuty5024 3 years ago

    Great video, thanks.

  • @yassinezarrouk9806
    @yassinezarrouk9806 3 years ago +1

    Phenomenal video... thank you sooo much. Could you please make a video about using deep reinforcement learning along with OpenCV and robotics?

    • @Eigensteve
      @Eigensteve  3 years ago

      Great suggestion! I'll look into it

  • @ryanmckenna2047
    @ryanmckenna2047 1 year ago

    When should a system be modelled deterministically vs probabilistically?

  • @HD-qq3bn
    @HD-qq3bn 4 years ago

    I really like your explanation

  • @Voke
    @Voke 4 years ago +2

    One piece of unsolicited advice: don’t call these golden nuggets lectures and more people will enjoy watching them... These aren’t lectures, they’re complex topics simplified or explainer videos, etc! 💥 BOOM 💥

    • @Voke
      @Voke 4 years ago

      Subscribed BTW, love your passion!

  • @matthewjames7513
    @matthewjames7513 3 years ago

    Great talk, but I don't understand the difference between the quality function Q(s,a) and the policy pi(s,a). They seem to do the same thing?
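
    One way to see the distinction asked about in the last comment, with a made-up Q-table: Q(s,a) assigns a score to every state-action pair, while a policy only specifies which action to take (or a distribution over actions) in each state. A greedy policy can be read off from Q, but it is just one policy among many.

    # Q(s,a) scores every (state, action) pair; a policy maps states to actions.
    # A greedy policy is one particular policy derived from Q (values made up).
    import numpy as np

    Q = np.array([[0.1, 0.7],    # Q(s=0, a=0), Q(s=0, a=1)
                  [0.4, 0.2],    # Q(s=1, a=0), Q(s=1, a=1)
                  [0.0, 0.9]])   # Q(s=2, a=0), Q(s=2, a=1)

    greedy_policy = Q.argmax(axis=1)      # deterministic: best action per state
    print(greedy_policy)                  # -> [1 0 1]

    # A policy need not come from Q at all, e.g. a uniform random policy:
    random_policy = np.full_like(Q, 0.5)  # pi(a|s) = 0.5 for every action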