Nonlinear Control: Hamilton Jacobi Bellman (HJB) and Dynamic Programming

  • Published on 28 Sep 2024

Comments • 95

  • @ailsani8749
    @ailsani8749 2 years ago

    I am a follower from his 'control bootcamp' series. Just trying to tell everyone new here that his video is life-saving.

  • @charlescai6672
    @charlescai6672 5 months ago +4

    Very good explanation of the derivation of the HJB equation. One point to add: I think there may be a typo in 'DERIVING HJB EQUATION'. In dV/dt, where the integral of L(x,u) is minimized, the lower limit of the integral should be t instead of 0. Only in that case can we conclude, in the second-to-last equation, that -L(x(t), u(t)) comes from the time derivative of the integral of L(x,u)...

  • @prantel1
    @prantel1 2 years ago +6

    At 11:47 the bounds of the integral should be from “t” to “tf”, not from 0 to tf. If you make that change, then the derivative of the integral wrt t will be -L(.,.)

    • @BalajiSankar
      @BalajiSankar 1 year ago

      Can you please tell how changing the lower limit changes the sign?

    • @BarDownBoys
      @BarDownBoys 1 year ago +2

      @BalajiSankar I’m happy I can answer, as I came here to ask the same question and Behzad cleared it up for me.
      As Behzad stated, it should be the integral (t to tf). Then you agree that you can write this as the negative of the integral (tf to t). Then simply look at the fundamental theorem of calculus: the lower limit, being a constant, drops out, and since the upper limit is the variable you’re differentiating with respect to, what’s inside (-L) is your output

    • @kirar2004
      @kirar2004 4 months ago

      @BarDownBoys Thanks
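
The sign argument in this thread can be written out as a short worked equation. This is only a sketch: it assumes a fixed trajectory x, u and a constant final time t_f, and the dummy integration variable \tau is notation introduced here, not taken from the video.

```latex
\frac{d}{dt}\int_{t}^{t_f} L\big(x(\tau),u(\tau)\big)\,d\tau
  = -\frac{d}{dt}\int_{t_f}^{t} L\big(x(\tau),u(\tau)\big)\,d\tau
  = -L\big(x(t),u(t)\big),
\qquad\text{whereas}\qquad
\frac{d}{dt}\int_{0}^{t_f} L\big(x(\tau),u(\tau)\big)\,d\tau = 0 .
```

With constant bounds 0 and t_f the integral does not depend on t at all, which is why the lower limit has to be t for the -L term to appear.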

  • @hydropage2855
    @hydropage2855 26 days ago +1

    It doesn’t make sense to me how you took the derivative of an integral from 0 to tf, and that didn’t go to 0. Isn’t tf a constant? So an integral over constant bounds in time is a constant in time as well?

  • @SRIMANTASANTRA
    @SRIMANTASANTRA 2 years ago +2

    Lovely, Professor Steve

  • @dmitry.bright
    @dmitry.bright 2 years ago

    thanks Steve for a great lecture; looking forward to more lectures on RL and non-linear control if possible with some simple examples. thank you very much!

  • @julienriou4511
    @julienriou4511 2 years ago +12

    that's weird not to talk about the Pontryagin Maximum Principle in an introduction to optimal control

    • @Eigensteve
      @Eigensteve 2 years ago +11

      That's a great point. There are a lot of things conspicuously missing from these intro lectures. A lot of it is that I'm still learning more about these topics myself. Maybe a topic for another day!

  • @ronbackal
    @ronbackal 7 months ago

    Thanks! That is very interesting. I have the book Data driven science and engineering, which I want to get to sometime to learn more deeply

  • @nitishabordoloi3987
    @nitishabordoloi3987 1 year ago

    Hello Steve, can you please comment on the necessity of terminal cost in the performance index

  • @emmab5151
    @emmab5151 2 years ago

    Amazing!

  • @__--JY-Moe--__
    @__--JY-Moe--__ 2 years ago

    👍I don't know why I see super mario Bros!! I love Calculus though!! this goes well, with my jacobian meshing geometries! Rosey the Robot was so over worked! X0-Xn= Cello...ha..ha..💫

  • @hfkssadfrew
    @hfkssadfrew 2 years ago +27

    Hey Steve, at 9:11 it should be an integration from t to t_f; that’s where the minus sign comes from.

    • @umarniazi7320
      @umarniazi7320 2 years ago

      Yes, you are right.

    • @MBronstein
      @MBronstein 2 years ago

      But then shouldn’t there also be an integral going from t0 to t?

    • @hfkssadfrew
      @hfkssadfrew 2 years ago

      @MBronstein I guess it is because t can vary arbitrarily from t0 to tf, and the whole point is to analyze the derivative wrt t anyway. So there is no need to derive another one from t0 to t.

    • @MBronstein
      @MBronstein 2 years ago

      @hfkssadfrew But the definition of V goes from t_0 to t_f. So we have V = integral of L going from t_0 to t, plus the integral from t to t_f, plus Q. Notice, if we take the derivative now, we get -L from the first integral and +L from the second integral. I don't understand why we can just ignore the second integral

    • @Eigensteve
      @Eigensteve 2 years ago

      Good catch, thanks! I caught this in the 2nd edition book proofs, but not before the video...
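
A quick numerical check of the point debated in this thread, as a sketch under stated assumptions: `running_cost` is a made-up running cost along a fixed trajectory, and `t0`, `t`, `t_f` are arbitrary illustrative values. In the usual convention the value function V(x(t), t, t_f) is the cost-to-go from the current time t, so only the integral from t to t_f enters it. The cost already accumulated from t0 to t has slope +L, the cost-to-go has slope -L, and only the latter belongs to V.

```python
import math

def running_cost(tau):
    # Hypothetical running cost L evaluated along a fixed trajectory
    return 1.0 + math.sin(tau) ** 2

def integrate(a, b, n=20000):
    # Composite trapezoidal rule for the integral of running_cost over [a, b]
    h = (b - a) / n
    total = 0.5 * (running_cost(a) + running_cost(b))
    for i in range(1, n):
        total += running_cost(a + i * h)
    return total * h

def d_dt(fun, t, eps=1e-4):
    # Central finite-difference approximation of d fun / dt
    return (fun(t + eps) - fun(t - eps)) / (2.0 * eps)

t0, t, t_f = 0.0, 0.7, 3.0

# Cost-to-go: integral from t to t_f (this is what enters V)
slope_to_go = d_dt(lambda s: integrate(s, t_f), t)
# Cost already spent: integral from t0 to t (not part of the cost-to-go)
slope_so_far = d_dt(lambda s: integrate(t0, s), t)

print(slope_to_go, -running_cost(t))   # these two should nearly agree
print(slope_so_far, running_cost(t))   # and so should these
```

Adding both pieces gives the integral over the fixed interval from t0 to t_f, whose derivative with respect to t is zero, consistent with the constant-bounds objection raised elsewhere in the comments.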

  • @higasa24351
    @higasa24351 2 years ago +14

    This is the first time I've ever seen HJB-DP explained in an intuitive and fashionable way, not by following the textbook lines one by one. Thank you so much for the great talk.

  • @alanzhus2730
    @alanzhus2730 2 years ago +10

    Can't believe a topic as serious as this can have thousands of views hours after release. YouTube is really a magic place.

    • @Eigensteve
      @Eigensteve 2 years ago

      It's pretty wild to me how many people like hard math :)

  • @blitzkringe
    @blitzkringe 2 years ago +6

    Please do more of this content. Thank you.

    • @Eigensteve
      @Eigensteve 2 years ago

      Glad you like it!

  • @CoffeeVector
    @CoffeeVector 2 years ago +2

    In the equation, dx/dt = f(x(t), u(t), t), why is there an extra dt at the end?

  • @ramanujanbose6785
    @ramanujanbose6785 2 years ago +2

    Steve I follow all of your lectures. Being a mechanical engineer I really got amazed by watching your turbulence lectures. I personally worked with CFD using scientific python and visualization and computation using python and published a couple of research articles. I'm very eager to work under your guidance in the field of CFD and Fluid dynamics using Machine learning specifically simulation and modelling of turbulence fluid flow field and explore the mysterious world of turbulence. How should I reach you for further communication?

  • @chinamatt
    @chinamatt 2 years ago +3

    Hi Steve, thanks for the lecture. At the beginning, should the differential equation be dx/dt = f(x,u,t)? In the derivation of the HJB equation, dx/dt is substituted with f(x,u).

  • @djredrover
    @djredrover 1 year ago +1

    it would be lovely if you could do a MATLAB demo of an ONC using HJB for a hovercraft/drone with full 6-DOF model.

  • @sounakmojumder5689
    @sounakmojumder5689 2 years ago +1

    thank you, I have a request if you can please upload a lecture on infinite horizon model predictive control......

  • @rajanisingh1148
    @rajanisingh1148 4 months ago

    @Eigensteve, thanks for such nice and interesting videos. I've seen all your videos on reinforcement learning. It would be really helpful if you could do a lecture on how dynamic games (either discrete or continuous time) can be solved using reinforcement learning, with a walkthrough example. For now, the theoretical concepts of reinforcement learning are clear from your videos, but how it's actually implemented to solve problems is still unclear. Also, if you can recommend some resources, that would be a bonus!

  • @canis_mjr
    @canis_mjr 4 months ago

    Great video (not). Where is the example? The idea of the approach is clear and primitive, but how do you apply it in practice?

  • @junhyeongjunhyeong
    @junhyeongjunhyeong 1 month ago

    Nice introduction to HJB. At 12:25, why do we take an action at xn (the terminal state)? It is not intuitively clear to me. If the cost function L is given, we can get an action at xn: the action that minimizes the cost function at xn. But it seems like an unnecessary action when I think about it.

  • @qejacwa
    @qejacwa 2 years ago +2

    This is a fantastic video on the derivation. However, there are quite some typos in the video. Hopefully, Steve can correct them. For example, the lower limit in the integral is supposed to be t instead of 0 in the derivation of HJB equation.

    • @sai4007
      @sai4007 2 years ago

      Yep, without this correction -L(x, u) derivation doesn't make sense

  • @leventguvenc917
    @leventguvenc917 2 years ago +2

    Very nice video. In deriving the HJB equation, the lower limit of the integral should be t instead of 0.

  • @CupuycA
    @CupuycA 2 years ago +1

    1:35 mistake in the equation

  • @qiangli4022
    @qiangli4022 5 months ago

    Actor-critic seems to be categorized as model-free RL in other literature.

  • @G12GilbertProduction
    @G12GilbertProduction 2 years ago

    This Hilbert space is include in f(x(k),u(k) * (x(0),y(k)-0) or outside the x(k) - (without double equation)?

  • @qiguosun129
    @qiguosun129 6 months ago

    Great lecture. Could you think about discussing HJB with variational inequalities? Thanks!

  • @hw1875
    @hw1875 1 year ago

    At 16:58, should the V on the RHS of the discrete-time HJB be associated with n, not n-1? The cost-to-go (from k to n) should be equal to the current cost plus the cost-to-go (from k+1 to n)
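
The recursion described in this comment (cost-to-go from k equals the current cost plus the cost-to-go from k+1) can be sketched on a toy problem. Everything below is a hypothetical illustration and is not taken from the video: an integer state clipped to [-3, 3], three actions, a quadratic stage cost and a quadratic terminal cost, with the horizon index called `N`. The backward recursion is checked against brute-force enumeration of all action sequences.

```python
from itertools import product

STATES = range(-3, 4)      # hypothetical integer state grid
ACTIONS = (-1, 0, 1)       # hypothetical action set
N = 4                      # horizon: decisions are made at k = 0, ..., N-1

def step(x, u):
    # Toy dynamics x_{k+1} = f(x_k, u_k), clipped so the state stays on the grid
    return max(-3, min(3, x + u))

def stage_cost(x, u):
    # Running cost L(x_k, u_k)
    return x * x + u * u

def terminal_cost(x):
    # Terminal cost Q(x_N)
    return 10 * x * x

def backward_dp():
    # V[k][x] is the optimal cost-to-go from state x at step k
    V = {N: {x: terminal_cost(x) for x in STATES}}
    for k in range(N - 1, -1, -1):
        V[k] = {
            x: min(stage_cost(x, u) + V[k + 1][step(x, u)] for u in ACTIONS)
            for x in STATES
        }
    return V

def brute_force(x0):
    # Enumerate every action sequence of length N and keep the cheapest
    best = float("inf")
    for seq in product(ACTIONS, repeat=N):
        x, total = x0, 0
        for u in seq:
            total += stage_cost(x, u)
            x = step(x, u)
        best = min(best, total + terminal_cost(x))
    return best

V = backward_dp()
print(all(V[0][x] == brute_force(x) for x in STATES))
```

In this sketch the value on the right-hand side carries the index k+1 and the recursion starts from the terminal cost at k = N, which matches the reading given in the comment.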

  • @RGDot422
    @RGDot422 1 year ago

    Why is d( integral( L(x,u) dt ) )/dt = -L(x,u)? Specifically, why the negative sign?

  • @aiwithhamzanaeem
    @aiwithhamzanaeem 5 months ago

    Thanks Professor Steve, Finally I completed the playlist.

  • @Silva98122
    @Silva98122 2 years ago

    In general, if DP algorithm depends on discretization and interpolation in continuous state space and input space when solving a discrete time, finite time optimal control problem, does it yield a suboptimal solution?

  • @Connect.2source
    @Connect.2source 1 year ago

    Is there any way I can learn from you in more detail? Any programs you offer by chance? Thanks so much!!

  • @mohammadabdollahzadeh268
    @mohammadabdollahzadeh268 1 year ago

    Thanks, dear Steve, for this wonderful tutorial.
    I was wondering, would it be possible for you to solve an example for that?

  • @clairecheung5388
    @clairecheung5388 1 year ago

    The lower bound of the integral for V(x(t),t,t_f) should be t instead of 0.

  • @matouspikous
    @matouspikous 2 years ago

    min(L) != -min(-L), I don't know how to cancel these minus signs.
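
The identity behind this comment is that a minus sign pulled through a minimum turns it into a maximum: min(L) = -max(-L), while -min(-L) equals max(L). A tiny check on made-up numbers (the list `values` is purely illustrative):

```python
values = [3.0, -1.5, 2.0, 0.25]   # hypothetical costs for four candidate controls

lhs = min(values)                  # min(L)
rhs = -max(-v for v in values)     # -max(-L), equal to min(L)
wrong = -min(-v for v in values)   # -min(-L), which is max(L) instead

print(lhs, rhs, wrong)
```

So a minus sign cannot simply be cancelled across a min. It either has to stay outside the minimization or the min has to become a max.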

  • @wikipiggy0.0
    @wikipiggy0.0 2 years ago

    The derivation is not clear. Maybe it is due to the typos mentioned in other comments, but I find it hard to follow

  • @ctrlaltdebug
    @ctrlaltdebug 2 years ago

    Your trajectory x(t) is not a function.

  • @geonheelee4717
    @geonheelee4717 1 year ago

    A great lecture. I hope the next lecture will be released asap. In particular, I'm interested in the detailed relationship between RL and optimal control.

  • @beaglesnlove580
    @beaglesnlove580 1 year ago

    Crap ur 100x better than this horrible professor I had who was teaching hjb equation without any background.

  • @KHMakerD
    @KHMakerD 2 years ago

    Lol solving PDE’s is heinous by definition 😂😂

  • @amirhosseinafkhami2606
    @amirhosseinafkhami2606 2 years ago +1

    Hi Dr. Brunton, thanks for your excellent lecture.
    Do you have any good code examples of solving the HJB equation for non-linear systems?
    And what resources do you suggest for getting more depth into this field?

    • @Eigensteve
      @Eigensteve 2 years ago

      I don't have a good recent code... way back in grad school I remember solving these numerically as a two point boundary value problem... but all of that code is deprecated. Will look into a better example

    • @amirhosseinafkhami2606
      @amirhosseinafkhami2606 2 years ago +1

      @Eigensteve Actually, I took a look at chapter 11 of your book, but unfortunately, unlike other chapters, I did not find any sample code in it. I think it would be great if example code for solving the HJB equation of a non-linear system was added to the book! This could be a great complement to this chapter!
      Thank you so much again for making such great content

    • @Eigensteve
      @Eigensteve 2 years ago

      @amirhosseinafkhami2606 Totally agree, but this will need to wait for an updated version. Definitely in the works though.

    • @amirhosseinafkhami2606
      @amirhosseinafkhami2606 2 years ago

      @Eigensteve I look forward to the updated version of the book then

  • @batoolalhashemi1167
    @batoolalhashemi1167 2 years ago

    Please give us some examples for better understanding

  • @InfernalPasquale
    @InfernalPasquale 6 months ago

    Excellent communication

  • @peasant12345
    @peasant12345 2 years ago

    7:10 the bellman opt must include Q(x(t),t)

  • @tuptge
    @tuptge 2 years ago

    More on non linear control please! Im trying to make up my mind on topics for my postgrad thesis!

  • @vietanhle6321
    @vietanhle6321 1 year ago

    Good instructor

  • @mingyucai6559
    @mingyucai6559 2 years ago

    Clear tutorial. Thanks Prof. Steve. Keep following your steps.

  • @ecologypig
    @ecologypig 2 years ago

    Excellent. I can see a lot of connections with control, and how the essence of the Bellman equation is all over the place in different fields. Thanks Prof. Brunton!

  • @boldirio
    @boldirio 2 years ago

    Great as always Steve! I was wondering if you have any experience in transfer learning, specifically domain adaptation? If so it would be a cool topic to go through! /J

  • @justinting1422
    @justinting1422 2 years ago

    What's the purpose of the terminal cost? It just disappears when you take the time derivative at 9:22, since it's just a constant, so it shouldn't affect the trajectory of u(t). Also, isn't the cost of the final state already taken into account in the integral, since it integrates all the way to tf anyway?

    • @sechristen
      @sechristen 2 years ago

      The terminal cost term will appear as a boundary condition in the PDE that HJB gives us, as V(x(t_f),t_f,t_f)=Q(x(t_f),t_f).
      The terminal cost cannot be taken inside the integral (without breaking all the other math by including delta functions as valid cost functions).
      The formulas in the video are derived with the idea of a fixed tf, so if t_f doesn't vary the final cost function will probably look like "After attempting to control the dynamical system, did it end where I wanted it to? eg Q=(x(t_f)-x_target)^2"
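
The boundary-condition role of the terminal cost can be seen in a scalar linear-quadratic sketch. All of the following is an assumption made for illustration and is not from the video: dynamics dx/dt = a x + b u, running cost L = q x^2 + r u^2, terminal cost Q = q_f x(t_f)^2, and the ansatz V(x, t) = P(t) x^2. Substituting into the HJB equation gives -dP/dt = q + 2aP - (b^2/r)P^2 with P(t_f) = q_f, and the optimal feedback is u = -(b P(t)/r) x. The function name `riccati_p0` and the parameter values are hypothetical.

```python
import math

def riccati_p0(q_f, horizon, a=1.0, b=1.0, q=1.0, r=1.0, steps=100000):
    """Integrate -dP/dt = q + 2aP - (b^2/r)P^2 backward from P(t_f) = q_f.

    Uses explicit Euler in the reversed time s = t_f - t and returns P at t = 0.
    """
    h = horizon / steps
    p = q_f  # the terminal cost enters only here, as the boundary condition
    for _ in range(steps):
        p += h * (q + 2.0 * a * p - (b * b / r) * p * p)
    return p

# Positive root of q + 2aP - (b^2/r)P^2 = 0 for the default parameters
p_inf = 1.0 + math.sqrt(2.0)

short_no_terminal = riccati_p0(q_f=0.0, horizon=0.5)
short_big_terminal = riccati_p0(q_f=10.0, horizon=0.5)
long_no_terminal = riccati_p0(q_f=0.0, horizon=10.0)
long_big_terminal = riccati_p0(q_f=10.0, horizon=10.0)

print(short_no_terminal, short_big_terminal)  # differ: the gain b*P(0)/r depends on q_f
print(long_no_terminal, long_big_terminal)    # both settle near p_inf
```

On a short horizon the terminal weight changes P(0) and therefore the feedback gain, so it does affect u(t) even though Q drops out of the time derivative. On a long horizon its influence at t = 0 fades in this example.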

  • @amaarquadri
    @amaarquadri 2 years ago

    Wow it's so cool that these concepts from reinforcement learning apply so perfectly to nonlinear control.

  • @CupuycA
    @CupuycA 2 years ago

    9:15 it's not obvious, that the operators min and d/dt commute. In general this of course is not true.

    • @matouspikous
      @matouspikous 2 years ago

      I think there shouldn't be the minimum. V is just what is in the minimum. You do the calculations and then, you say that some V* is the optimal, which has the minimum in the equation.
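
One standard way to reach the HJB equation without exchanging min and d/dt is to apply Bellman optimality over a short interval. The following is a sketch of that textbook argument, not necessarily the route taken in the video. It assumes the optimal value function V is smooth, uses dynamics dx/dt = f(x,u) and running cost L(x,u), and introduces \Delta t and \tau as notation here.

```latex
\begin{aligned}
V\big(x(t),t\big)
  &= \min_{u}\left\{ \int_{t}^{t+\Delta t} L\big(x(\tau),u(\tau)\big)\,d\tau
     + V\big(x(t+\Delta t),\,t+\Delta t\big) \right\} \\
  &= \min_{u}\left\{ L(x,u)\,\Delta t + V\big(x(t),t\big)
     + \frac{\partial V}{\partial t}\,\Delta t
     + \left(\frac{\partial V}{\partial x}\right)^{\!T} f(x,u)\,\Delta t
     + o(\Delta t) \right\} \\
\Longrightarrow\quad
-\frac{\partial V}{\partial t}
  &= \min_{u}\left[ L(x,u)
     + \left(\frac{\partial V}{\partial x}\right)^{\!T} f(x,u) \right],
\qquad V\big(x(t_f),t_f\big) = Q\big(x(t_f),t_f\big).
\end{aligned}
```

The terms V and \partial V/\partial t do not depend on u, so they can be moved out of the minimization before dividing by \Delta t and letting \Delta t \to 0. No derivative is ever taken through the min itself.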

  • @sounghwanhwang5422
    @sounghwanhwang5422 2 years ago

    One of the best lectures that I've ever seen!