Iterative Policy Evaluation Algorithm in Python and OpenAI Gym - Reinforcement Learning Tutorial

Aleksandar Haber PhD

Views: 4,660

Published on Dec 3, 2024

Comments • 15

@aleksandarhaber 2 years ago ⁺¹
It takes a significant amount of time and energy to create these free video tutorials. You can support my efforts in the following ways:
- Buy me a Coffee: www.buymeacoffee.com/AleksandarHaber
- PayPal: www.paypal.me/AleksandarHaber
- Patreon: www.patreon.com/user?u=32080176&fan_landing=true
- You can also press the Thanks YouTube Dollar button
@RahulYadav-w1v4l 1 year ago ⁺²
Your explanation of the concept was beautiful. Thank you so much.
@aleksandarhaber 1 year ago
Thank you very much!
@peralser 1 year ago ⁺¹
Thanks for your explanation.
@aleksandarhaber 1 year ago ⁺¹
Thank you!
@pulkitprajapat7862 1 year ago ⁺¹
Thanks a lot for these videos, I am loving them.
@aleksandarhaber 1 year ago
Great! Thank you for the encouraging comments!
@samlee9126 2 years ago ⁺¹
Thank you for your tutorial! It really helps me with my project. A small gift has been sent to you via PayPal.
@aleksandarhaber 2 years ago ⁺¹
Thank you very much!
@northstar6887 1 year ago ⁺¹
Do you have a video that covers policy iteration?
@aleksandarhaber 1 year ago
Check if there is a tutorial in the reinforcement learning list.
@dulanjanaperera988 1 year ago ⁺²
Isn't the value of the goal state 1?
@aleksandarhaber 1 year ago ⁺³
First of all, the state value function at a given state is defined as the expected sum of rewards that you will obtain by going from that state to the next states. That is, it is a sum of rewards that does not include the reward obtained by reaching the current state. Since the goal state is a terminal state, you do not go to any other state. Consequently, the sum of rewards is zero, and this means that the state value function in the terminal state is zero. This is actually by definition (see Sutton and Barto's book on reinforcement learning).
Sutton and Barto's book recommends initializing the state value function to zero in all terminal states. The goal state value is not updated in the iterative algorithm, so if you set an initial value, it will stay at that value. You only get a reward of +1 by reaching the goal state. I think the value function at the goal state acts as a boundary condition in the Bellman equation. I am not sure what will happen if you initialize the state value function in the goal state to a non-zero value. You can try.
@dulanjanaperera988 1 year ago ⁺¹
@@aleksandarhaber Thanks for the clarification. I understand the reason now.
@aleksandarhaber 1 year ago ⁺¹
@@dulanjanaperera988 Good!
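
The point made in the reply above about the goal state can be checked with a small experiment. The following is a minimal sketch, not the code from the video: it assumes a hypothetical 4-state chain (the helper `build_chain`, the function `evaluate_policy`, and the parameter `v_goal` are made up for illustration), a uniformly random policy, a discount of 0.9, and a reward of +1 only on the transition into the goal. The goal state is never updated, so whatever value it is given acts as a boundary condition for the other states.

```python
import numpy as np


def build_chain(n_states=4):
    """Hypothetical 1-D chain. Actions: 0 = left, 1 = right. Last state is the goal."""
    goal = n_states - 1
    P = {}
    for s in range(n_states - 1):  # transitions are defined for non-terminal states only
        P[s] = {}
        for a in (0, 1):
            nxt = max(s - 1, 0) if a == 0 else s + 1
            reward = 1.0 if nxt == goal else 0.0  # +1 only for reaching the goal
            P[s][a] = [(1.0, nxt, reward)]  # list of (probability, next state, reward)
    return P, goal


def evaluate_policy(P, policy, n_states, goal, gamma=0.9, tol=1e-10,
                    max_iters=10_000, v_goal=0.0):
    """Iterative policy evaluation; the goal value is fixed and never updated."""
    V = np.zeros(n_states)
    V[goal] = v_goal  # boundary condition (assumed zero by default)
    for _ in range(max_iters):
        V_new = V.copy()
        for s, actions in P.items():  # sweeps over non-terminal states only
            V_new[s] = sum(
                policy[s, a] * prob * (reward + gamma * V[nxt])
                for a, outcomes in actions.items()
                for prob, nxt, reward in outcomes
            )
        delta = np.max(np.abs(V_new - V))
        V = V_new
        if delta < tol:
            break
    return V


P, goal = build_chain()
policy = np.full((4, 2), 0.5)  # uniformly random policy (an assumption)
V = evaluate_policy(P, policy, n_states=4, goal=goal)
V_alt = evaluate_policy(P, policy, n_states=4, goal=goal, v_goal=5.0)
print("goal value 0:", V)
print("goal value 5:", V_alt)
```

In this sketch the goal value stays at whatever it was initialized to, and a non-zero choice leaks into the neighboring states through the `gamma * V[nxt]` term. That is one way to see why zero is the consistent choice for a terminal state.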
