Bellman Equation - Explained!

  • Published on 11 Dec 2024

Comments • 10

  • @gauravshinde8767
    @gauravshinde8767 1 year ago +16

    YouTube algo, please make the relevance score of this video 10/10. This video is too good to be ignored.

    • @CodeEmporium
      @CodeEmporium 1 year ago +1

      Thank you! Now if only the YouTube gods listen

  • @vanilan3585
    @vanilan3585 1 year ago +4

    you just make video. what am i about to study😃

  • @jsp991204
    @jsp991204 10 months ago +1

    Thanks a lot!!😀

  • @slitihela1860
    @slitihela1860 10 months ago +1

    can you prepare a video for Double Q-Learning Network
    and Dueling Double Q-Learning Network
    please

  • @borneoland-hk2il
    @borneoland-hk2il 3 months ago

    So there are only two families of methods in RL, value-based and policy-gradient-based,
    and actor-critic methods fall into the policy-gradient-based category. Can you confirm that this is correct, and what is the source of this information? Or would you like to cover some actor-critic RL methods in future videos?

  • @alirezasalehabadi1422
    @alirezasalehabadi1422 5 months ago

    Thank you.

  • @bhaveshachhada7242
    @bhaveshachhada7242 10 months ago +18

    I was confused. You made me more confused. This doesn't explain the intuition.

    • @RelaxHERE-zk8ts
      @RelaxHERE-zk8ts 2 months ago

      lol, what was confusing here? He simply talked about policy generation and the value-function-based way of generating a policy, then described the two kinds of value function a policy can be generated from, V(s) and Q(s,a). The simple intuition was to be able to detect the maximum-reward state. You should look at Markov decision processes first, then it will make sense.

  • @rinibhasin17
    @rinibhasin17 8 months ago +3

    Confused :(