Passive Reinforcement Learning

แชร์
ฝัง

ความคิดเห็น • 5

  • @hypebeastuchiha9229
    @hypebeastuchiha9229 2 ปีที่แล้ว +2

    What a legend
    Thanks so much you have a talent for teaching!

  • @monikaklein222
    @monikaklein222 2 ปีที่แล้ว +1

    Thank you for this marvelous video! You explain these concepts so well!!!

  • @sahhaf1234
    @sahhaf1234 5 ปีที่แล้ว +1

    @14:55 do we know R(s) or do we estimate it?

    • @berty38
      @berty38  5 ปีที่แล้ว

      For ADP, we don't exactly know R. Though for this type of MDP, we can just memorize the R(s) we observe. In other MDPs, sometimes the reward can be randomized, so you can't just memorize it.

  • @lingchen8849
    @lingchen8849 2 ปีที่แล้ว

    @14:25 The function seems incorrect according to my understanding. Since the policy is fixed. Why we need to select action.. I am very confused.