This is a remarkable summary and demonstration. Thank you!
For this reason, I am implementing A2C for my master's. Thank you.
This is amazing. Especially for someone like me who knows A3C but hasn't implemented it yet. Thanks!
You can't really say you know an algorithm if you've never implemented it.
04:00 Why save s_prime if you don't use it?
6:30 - You talk about adding gamma * V(S') to the value of state S. You say that you're actually adding gamma^{num_steps} - I don't see how this is true. It looks to me like you're just adding (gamma times) the value of one step in the past, not recursively adding the value of all states that lead to state S as the Bellman equation describes. Please, can you clarify?
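For what it's worth, the gamma^{num_steps} factor shows up naturally in an n-step return: each reward is discounted by its delay, and the bootstrap term V(S') is discounted by the number of steps unrolled. A minimal sketch of that idea (function and variable names are my own, not from the video):

```python
def n_step_return(rewards, v_s_prime, gamma):
    """N-step bootstrapped return: sum_k gamma^k * r_k + gamma^n * V(s')."""
    G = 0.0
    for k, r in enumerate(rewards):
        G += gamma ** k * r
    # The bootstrap value is discounted by gamma^len(rewards),
    # i.e. gamma^num_steps -- not just a single factor of gamma.
    G += gamma ** len(rewards) * v_s_prime
    return G

# With 3 rewards, V(s') is weighted by gamma^3 = 0.729:
print(n_step_return([1.0, 1.0, 1.0], 10.0, 0.9))  # prints 10.0
```

So the recursion of the Bellman equation is folded into V(S') itself; only the discounting of that single bootstrap term is explicit.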
06:18, line 32: why do you use ReLU for the value network? The value of a state can be negative! It shouldn't have any activation, should it?
@Paval Koryakin ReLU is not used in the last layer of the value head, so the output can still be negative.
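To illustrate the reply: with ReLU only on the hidden layer and a linear output, the value estimate can come out negative. A toy sketch (the weights are made up to force a negative output):

```python
import numpy as np

def value_head(x, w_hidden, w_out):
    h = np.maximum(0.0, x @ w_hidden)  # ReLU on the hidden layer only
    return h @ w_out                   # linear output: any real number

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))
w_hidden = rng.normal(size=(4, 8))
w_out = -np.ones((8, 1))  # all-negative output weights => V(s) <= 0 here

v = value_head(x, w_hidden, w_out)
print(float(v))  # prints a non-positive value
```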
0:30 What are “actual estimated Q-values”?
Dom McKean In deep Q-learning, the output of the neural net is the "actual estimated Q-value" for each action in a specific state. You then take the highest one to get the action you should perform.
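In code, the greedy selection described here is just an argmax over the network's per-action outputs. A minimal sketch (the Q-values below are made up, not from the video):

```python
# Hypothetical Q-values the network might output for one state,
# one entry per action.
q_values = [0.2, 1.5, -0.3, 0.9]

# Greedy policy: pick the index of the highest estimated Q-value.
best_action = max(range(len(q_values)), key=lambda a: q_values[a])
print(best_action)  # prints 1
```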
S4ndwichGurk3 ‘actual estimated’ sounds like an oxymoron.
@@Kingstanding23 ah that's what you mean :D sorry haha i didn't even notice
What is a prime state? Is it the state with the max V(s), or something else?
It smells like Maxim Lapan's book. Anyway, the tutorial was great! :D
Good videos, but you can't just copy code and pretend it is your own. This is just copied from 'Deep Reinforcement Learning Hands-On' by Maxim Lapan, and looking back at some of your other videos, it is clear that you copied them as well. I have no problem with you using other people's code, you just have to properly reference it in your video and description, and not just link to your own GitHub. That is shady as shit.
Thanks for sharing.