Q Learning Algorithm Explanation

  • Published on 14 Dec 2024

Comments • 17

  • @roozy_313a2
    @roozy_313a2 6 months ago +1

    May God have mercy on your parents, Doctor 🌺

  • @shahd.h6430
    @shahd.h6430 1 year ago +1

    episode 1:
    state = room 4, action = room 0
    Q(4,0)=100+ 0.1*0 =100
    episode 2:
    state=3, action=4 Q(3,4)=0+0.1*100=10
    episode 3:
    state=1, action=3
    Q(1,3)=0+0.1*10=1
    So the path will be from room 1 to room 3 to room 4 to room 0 which is the goal
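
The three updates above can be replayed in a few lines of Python. This is a minimal sketch, not the video's code: the `REWARDS` table and the `q_update` helper are hypothetical names, and the table contains only the moves named in this comment, with a reward of 100 for entering goal room 0 and 0 otherwise.

```python
GAMMA = 0.1  # discount factor used in the arithmetic above

# Hypothetical reward table holding only the moves named in the comment:
# (state, action) -> immediate reward; entering goal room 0 pays 100.
REWARDS = {(4, 0): 100, (3, 4): 0, (1, 3): 0}


def q_update(q, state, action):
    """Apply Q(s, a) = R(s, a) + gamma * max over a' of Q(s', a')."""
    next_state = action  # the action is the room being entered
    next_values = [v for (s, _), v in q.items() if s == next_state]
    best_next = max(next_values, default=0.0)  # unseen rooms count as 0
    q[(state, action)] = REWARDS[(state, action)] + GAMMA * best_next
    return q[(state, action)]


q_table = {}
# One update per episode, in the order given in the comment.
for state, action in [(4, 0), (3, 4), (1, 3)]:
    value = q_update(q_table, state, action)
    print(f"Q({state},{action}) = {value:g}")
```

Each value is one tenth of the value one step closer to the goal, which is why the order of the episodes matters: Q(1,3) only becomes 1 once Q(3,4) is already 10.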

  • @jawaheralbaddawi
    @jawaheralbaddawi 11 months ago +1

    What do we do if we start from room 0?

  • @yosrmahmod5473
    @yosrmahmod5473 1 year ago

    Episode 1:
    Initial state at room 1
    Q(1,3)=0+0.1*0=0
    Q(3,4)=0+0.1*100=10
    Episode 2:
    Initial state at room 4
    Q(4,0)=100+0.1*0=100
    Episode 3:
    Initial state at room 5
    Q(5,4)=0+0.1*100=10
    Path 5 to 4 to 0

  • @youssef-ns9ny
    @youssef-ns9ny 1 year ago

    Youssef Mohamed
    The quiz consists of three episodes, and the goal is to compute the Q-values of specific state-action pairs. In episode 1, Q(4,0) is 100, which reaches the goal and ends the episode. In episode 2, Q(5,4) is 10 and Q(4,0) is still 100, which ends the episode. Similarly, in episode 3, Q(3,4) is 10 and Q(4,0) is still 100, which ends the episode. The final path to the goal, room 0, runs from room 1 to room 3 to room 4 to room 0.

  • @AbdallahAli-n3r
    @AbdallahAli-n3r 1 year ago

    Abdallah Ali Ahmed
    In episode 1, the initial state is room 1. Q(1,3) is calculated to be 0: the reward for moving from room 1 to room 3 is 0, the maximum Q-value at room 3 is 0, and the discount factor is 0.1. Q(3,4) is then determined to be 10: the reward for moving from room 3 to room 4 is 0, the maximum Q-value at room 4 is 100, and the discount factor is 0.1.
    In episode 2, the initial state is room 4. Q(4,0) is calculated to be 100: the reward for moving from room 4 to the goal room 0 is 100, the maximum Q-value at the terminal state is 0, and the discount factor is 0.1.
    In episode 3, the initial state is room 5. Q(5,4) is determined to be 10: the reward for moving from room 5 to room 4 is 0, the maximum Q-value at room 4 is 100, and the discount factor is 0.1.
    To reach the goal at room 0, the final path uses the moves (5,4) and then (4,0), that is, from room 5 to room 4 to room 0.
    Thank you,

  • @nadaabdo7830
    @nadaabdo7830 1 year ago

    Nada Abdelregal
    The answer to the quiz is:
    Episode 1:
    State: room 4, action: room 0
    Q(4,0)=100 which is the goal
    Episode 2:
    State: room 5, action: room 4
    Q(5,4)=0+0.1*max(100, 0, 0)=10
    Episode 3:
    State: room 3, action: room 4
    Q(3,4)=0+0.1*max(100, 0, 0)=10
    So the path will be from room 1 to room 3 to room 4 to room 0 which is the goal

  • @SaraAhmed-t1f
    @SaraAhmed-t1f 1 year ago

    Episode 1
    Q(4,0) = 100+0.1*0 =100
    Episode 2
    Q(3,4) = 0 + 0.1*100= 10
    Episode 3
    Q(1,3) = 0+0.1*10 = 1
    So from room 1 to 3 to 4 to 0 and that's our goal

  • @AhmedAli-sz9vi
    @AhmedAli-sz9vi 1 year ago

    Ahmed Ali:
    The answer to the quiz:
    episode 1:
    Q(4,0)=100 (goal)
    Then finish the episode
    episode 2:
    Q(5,4)=10
    Q(4,0)=100 (goal)
    Then finish the episode
    episode 3:
    Q(3,4)=10
    Q(4,0)=100 (goal)
    Then finish the episode
    and the path in the end will be
    from room 1 to room 3 to room 4 to the goal room 0

  • @gamaladel9308
    @gamaladel9308 1 year ago +1

    ❤❤

  • @mouradfakhfakh4823
    @mouradfakhfakh4823 8 months ago

    Would you please share the python code. Thanks!
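
The video's own code is not reproduced in this thread, but the procedure the quiz answers describe can be sketched in Python. This is an illustration under stated assumptions, not the author's implementation: the `MOVES` layout contains only the moves that commenters mention (the full room map is not given here), the `reward` and `train` helpers are hypothetical names, actions are chosen purely at random, and the discount factor 0.1 and goal room 0 come from the comments.

```python
import random

GAMMA = 0.1  # discount factor used throughout the comments
GOAL = 0     # goal room named in the comments

# Hypothetical layout: only the moves that appear in this thread.
# The actual room map from the video is not reproduced here.
MOVES = {1: [3], 3: [4], 4: [0], 5: [4, 1]}


def reward(state, action):
    """Entering the goal room pays 100; every other move pays 0."""
    return 100 if action == GOAL else 0


def train(episodes=200, seed=0):
    """Run random-exploration episodes and return the learned Q-table."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, actions in MOVES.items() for a in actions}
    for _ in range(episodes):
        state = rng.choice(list(MOVES))        # random start room
        while state != GOAL:
            action = rng.choice(MOVES[state])  # pure random exploration
            next_values = [q[(action, a)] for a in MOVES.get(action, [])]
            best_next = max(next_values, default=0.0)
            q[(state, action)] = reward(state, action) + GAMMA * best_next
            state = action                     # the action is the next room
    return q


q_table = train()
for (s, a), v in sorted(q_table.items()):
    print(f"Q({s},{a}) = {v:g}")
```

With this assumed layout and enough episodes, the values settle at Q(4,0)=100, Q(3,4)=Q(5,4)=10, Q(1,3)=1 and Q(5,1)=0.1, in line with the hand calculations in the other comments.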

  • @MaiMohamed-d5j
    @MaiMohamed-d5j 1 year ago

    Mai Mohamed
    The answer to the quiz is:
    Episode 1:
    State: room 4, action: room 0
    Q(4,0)=100 which is the goal
    Episode 2:
    State: room 5, action: room 4
    Q(state, action)=R(state, action)+Gamma*max(Q(next state, all actions))
    Q(5,4)=0+0.1*max(100, 0, 0)=10
    Episode 3:
    State: room 3, action: room 4
    Q(state, action)=R(state, action)+Gamma*max(Q(next state, all actions))
    Q(3,4)=0+0.1*max(100, 0, 0)=10
    So the path will be from room 1 to room 3 to room 4 to room 0 which is the goal
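
Written out in full, the update rule used in this answer looks as follows. Here s' is the room entered by action a, and the maximum runs over the actions a' available from s'. Reading the triple 100, 0, 0 as the Q-values of the three actions available from room 4 is an assumption based on the arithmetic in this comment.

```latex
\begin{align}
Q(s, a) &= R(s, a) + \gamma \max_{a'} Q(s', a'), \qquad \gamma = 0.1 \\
Q(5, 4) &= 0 + 0.1 \cdot \max(100, 0, 0) = 10 \\
Q(3, 4) &= 0 + 0.1 \cdot \max(100, 0, 0) = 10
\end{align}
```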

  • @manarmohamed2923
    @manarmohamed2923 1 year ago

    Manar Mohamad
    The answer to the quiz is:
    Episode 1
    Random state: room 3, action 4
    Update Q matrix
    Q(3,4)=0+0.1*100=10
    4 is not the goal
    Then choose state 4, action 0
    Q(4,0)=100+0.1*0=100
    0 is the goal -> Episode 1 finished
    --------------------------
    Episode 2
    Random state: room 5, action 4
    Update Q matrix
    Q(5,4)=0+0.1*100=10
    4 is not the goal
    Then choose state 4, action 0
    Q(4,0)=100+0.1*0=100
    0 is the goal -> Episode 2 finished
    --------------------------
    Episode 3
    Random state: room 4, action 0
    Update Q matrix
    Q(4,0)=100+0.1*0=100
    0 is the goal -> Episode 3 finished
    --------------------------
    Start from state 5 by choosing the max value
    Optimal path is 5 to 4, then from 4 to 0
    [5-4-0]
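
Choosing the max value at each room can be sketched in Python as a greedy walk over the Q-table. The values below are copied from the three episodes in this comment. The `greedy_path` helper is a hypothetical name, and the Q(5,1) = 0 entry is an assumption added only so that room 5 has a second option to compare against.

```python
GOAL = 0

# Q-values copied from the three episodes above; Q(5,1) = 0 is an
# assumed unvisited entry added only to give room 5 a second option.
Q = {(3, 4): 10.0, (5, 4): 10.0, (5, 1): 0.0, (4, 0): 100.0}


def greedy_path(q, start, goal=GOAL, max_steps=10):
    """Follow the highest-valued action from each room until the goal."""
    path = [start]
    state = start
    while state != goal and len(path) <= max_steps:
        options = {a: v for (s, a), v in q.items() if s == state}
        if not options:
            break  # no learned action from this room
        state = max(options, key=options.get)  # room with the largest Q
        path.append(state)
    return path


print(greedy_path(Q, 5))  # [5, 4, 0]
```

The `max_steps` guard is there because a partially trained table can contain cycles, in which case a greedy walk would otherwise never stop.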

  • @mirnasaied-x6t
    @mirnasaied-x6t 1 year ago

    episode 1:
    state = room 4, action = room 0
    Q(4,0) = 100 + 0.1*0 = 100
    episode 2:
    state = 5, action = 1
    Q(5,1) = 0 + 0.1*0 = 0
    episode 3:
    state = 3, action = 4
    Q(3,4) = 0 + 0.1*100 = 10
    Then finish the episode

  • @MoatazMEwis
    @MoatazMEwis 1 year ago

    Here is the quiz, Prof. Mohamed:
    drive.google.com/file/d/12cYcErLIkQWl3pyGn-ya5N6hSftSJBTp/view?usp=drivesdk

  • @MK-cu1se
    @MK-cu1se 5 months ago

    ❤❤❤