Introduction to Reinforcement Learning | DigiKey

DigiKey

37,094 views

Published on Jun 1, 2024
Reinforcement Learning (RL) is a field of machine learning that aims to find optimal solutions to control theory problems for various tasks. It employs an artificial intelligence (AI) “agent” that takes in observations, chooses actions, and learns from rewards. Modern RL algorithms train agents using trial-and-error approaches that involve directly interacting with the given environment.
In the video, we cover the basic theory behind RL and demonstrate how to use Farama Foundation Gymnasium (gymnasium.farama.org/) and Stable Baselines3 (stable-baselines3.readthedocs...) in Python to train an AI agent to solve the classic cartpole (gymnasium.farama.org/environm...) control theory problem. At the end of the video, we encourage you to try applying the knowledge to solve the slightly more advanced inverted pendulum problem (gymnasium.farama.org/environm...).
The solution to the challenge can be found here: www.digikey.com/en/maker/proj...
Code for training RL agents to solve both the cartpole and pendulum problems can be found here: github.com/ShawnHymel/reinfor...
In RL, the environment can be anything the agent interacts with, such as board games, video games, virtual settings, or the real world. We often use a code wrapper (e.g. Gymnasium) to observe this environment, perform agent-specified actions, and assign rewards. Note that rewards are considered part of the environment and are instrumental in training.
The decision-making process for choosing actions based on observations is known as the “policy.” During training, the agent selects actions randomly or per policy. The environment then offers a new observation and reward, guiding the training algorithm to help the agent choose actions leading to higher predicted total rewards in the future.
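One common way to mix random and policy-driven action selection is an epsilon-greedy rule, paired here with a tabular Q-learning update. This is a sketch under assumptions: the description above does not say which exploration rule is used, and the names `choose_action` and `q_update`, the learning rate `alpha`, and the discount factor `gamma` are illustrative choices.

```python
import random


def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy: random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def q_update(q_table, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Nudge Q(state, action) toward reward + gamma * max Q(next_state, .)."""
    # No future reward is expected once the episode has ended
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])


if __name__ == "__main__":
    rng = random.Random(0)
    q_table = [[0.0, 0.0], [1.0, 2.0]]  # 2 states x 2 actions (toy values)
    action = choose_action(q_table[0], epsilon=0.1, rng=rng)
    q_update(q_table, state=0, action=action, reward=1.0, next_state=1, done=False)
    print(q_table)
```

Lowering `epsilon` over the course of training shifts the agent from exploring toward exploiting what it has learned.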
The cartpole problem consists of a virtual pole balanced on top of a cart that can only move left and right. The goal is to design an AI agent that can keep the pole balanced by pushing the cart left or right. In the video, we use Deep Q-Learning (towardsdatascience.com/deep-q...) to train a Deep Q-Network (DQN) to solve the cartpole problem.
We list some recommended reading and viewing materials below if you would like to dive deeper into reinforcement learning.
Articles:
Reinforcement Learning Algorithms - an intuitive overview - / reinforcement-learning...
Which Reinforcement learning-RL algorithm to use where, when and in what scenario? - medium.datadriveninvestor.com...
Q-Learning vs. Deep Q-Learning vs. Deep Q-Network - www.baeldung.com/cs/q-learnin...
Deep Q Networks (DQN) With the Cartpole Environment - wandb.ai/safijari/dqn-tutoria...
RL - Proximal Policy Optimization (PPO) Explained - / rl-proximal-policy-opt...
Proximal Policy Optimization (PPO) - huggingface.co/blog/deep-rl-ppo
Related Videos:
Exploring Reinforcement Learning: Can AI Learn to Play QWOP?
Intro to Edge AI
Related Project Links:
Intro to Reinforcement Learning Using Gymnasium and Stable Baselines3
Related Articles:
Teach an AI to play QWOP
What is Edge AI? Machine Learning + IoT
Learn more:
Maker.io - www.digikey.com/en/maker
DigiKey’s Blog (TheCircuit) - www.digikey.com/en/blog
Connect with Digi-Key on Facebook / digikey.electronics
And follow us on Twitter / digikey
00:00 - Intro
00:59 - History of reinforcement learning
02:14 - Environment and agent interaction loop
06:21 - Gymnasium and Stable Baselines3
07:55 - Hands-on: how to set up a Gymnasium environment
26:57 - Markov decision process
31:02 - Bellman equation for the state-value function
34:12 - Bellman equation for the action-value function
35:47 - Bellman optimality equations
36:43 - Exploration vs. exploitation
38:39 - Recommended textbook
39:25 - Model-based vs. model-free algorithms
40:27 - On-policy vs. off-policy algorithms
41:19 - Discrete vs. continuous action space
42:36 - Discrete vs. continuous observation space
43:56 - Overview of modern reinforcement learning algorithms
46:29 - Q-learning
49:27 - Deep Q-network (DQN)
51:59 - Hands-on: how to train a DQN agent
01:12:36 - Usefulness of reinforcement learning
01:13:26 - Challenge: inverted pendulum
01:14:10 - Conclusion
Science and Technology

Comments • 7

@OMNI_INFINITY 3 months ago
Glad Shawn is making tutorials again! Thanks! Congratulations on getting hired by DigiKey! Should make a “PCB art in KiCAD” video, a “surface mount prototyping with only a desoldering hotplate” video, and a “how to make a DIY pick-and-place robot arm that has machine vision capability” video. And yes, would maybe be up for collaborating on designing that pick-and-place robot. Really don’t like soldering by hand or placing tiny components by hand.
@dave20874 10 months ago ⁺⁴
I watched thinking this would be a good refresher for material I'd learned over the last couple years. But I see a lot has changed with gymnasium. And the stable baselines material was all new to me. I learned more than I expected.
@PatrickHoodDaniel 9 months ago
This is, by far, the best ad that I have ever seen!! Also, a great explanation. Thank you for pushing this to me.
@geekzombie8795 10 months ago ⁺²
Thank you for this informative video!
@OMNI_INFINITY 3 months ago
If society were a meritocracy, both Shawn and I would already be millionaires or billionaires. On with the casing design of my new computer product. Wish I had millions to market it more properly.
@insanitygamer_vibing 10 months ago ⁺¹
Why is this an ad on YouTube Music?? 😂😂😂
@geekzombie8795 10 months ago
It’s education!
