Johnny Code
United States
Joined Aug 26, 2022
I take complex topics in programming, Machine Learning, and Reinforcement Learning and make them simpler for you ;D
I'm a Walking Dead fan (back when it was good), as you can guess from my profile pic, which my wife drew. If my stuff helped you, consider buying me a coffee (or brainz):D
Dueling DQN (Dueling Architecture) Explained & Implemented | DQN PyTorch Beginners Tutorial #11
Enhance the DQN module with Dueling DQN (aka Dueling Architecture).
*Support me here😀😀😀:* www.buymeacoffee.com/johnnycode
Github code: github.com/johnnycode8/dqn_pytorch
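For a quick picture of the idea before watching, here is a minimal sketch of a dueling architecture (not the code from the video; layer sizes and names are illustrative):

```python
import torch
from torch import nn

class DuelingDQN(nn.Module):
    """Dueling architecture: a shared feature layer feeds separate value and
    advantage streams, which are combined into Q-values. Expects batched input."""
    def __init__(self, state_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden_dim), nn.ReLU())
        self.value = nn.Linear(hidden_dim, 1)               # V(s)
        self.advantage = nn.Linear(hidden_dim, action_dim)  # A(s, a)

    def forward(self, x):
        h = self.feature(x)
        v = self.value(h)
        a = self.advantage(h)
        # Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        return v + a - a.mean(dim=1, keepdim=True)
```

The value stream estimates how good the state is on its own, while the advantage stream ranks the actions; subtracting the mean advantage keeps the decomposition well defined.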
Views: 287
Videos
How Prioritized Experience Replay (PER) Works in Deep Q-Learning
214 views • 28 days ago
Explanation of Prioritized Experience Replay (PER) in Deep Q-Learning. Deep Q-Learning Explained - th-cam.com/video/EUrWGTCGzlA/w-d-xo.html Support me here: www.buymeacoffee.com/johnnycode
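As a rough illustration of what PER does (not the code from the video, and a real implementation would use a sum tree instead of a plain list for speed):

```python
import numpy as np

class SimplePER:
    """Proportional prioritized replay, list-based for clarity.
    alpha controls how strongly priorities bias sampling; new transitions
    get the current max priority so they are sampled at least once."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.buffer, self.priorities = [], []

    def add(self, transition):
        p = max(self.priorities, default=1.0)
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0); self.priorities.pop(0)
        self.buffer.append(transition); self.priorities.append(p)

    def sample(self, batch_size, beta=0.4):
        probs = np.array(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias introduced by prioritized sampling.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return [self.buffer[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-5):
        # After a training step, priority becomes |TD error| (plus a small epsilon).
        for i, e in zip(idx, td_errors):
            self.priorities[i] = abs(e) + eps
```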
Creating Enemy Animations, Collision Layers and Masks | Godot 4.3 “Your First 2D Game” Tutorial
137 views • 1 month ago
We'll create the enemy animations, collision layers, and collision masks. Video walkthrough of the "Your First 2D Game" tutorial: docs.godotengine.org/en/stable/getting_started/first_2d_game/index.html
Coding Player Movement, Animations, Collision Detection | Godot 4.3 “Your First 2D Game” Tutorial
195 views • 1 month ago
We'll code the player's movement, animations, and collision detection. Video walkthrough of the "Your First 2D Game" tutorial: docs.godotengine.org/en/stable/getting_started/first_2d_game/index.html
Creating the Player Character Scene | Godot 4.3 “Your First 2D Game” Tutorial
334 views • 1 month ago
Let's learn Godot together! We'll set up the Godot project (GDScript, not C#) and create the player scene. Video walkthrough of the "Your First 2D Game" tutorial: docs.godotengine.org/en/stable/getting_started/first_2d_game/index.html
Easiest Way to Train AI to Play Atari Games with Reinforcement Learning | RL Baselines3 Zoo Tutorial
760 views • 2 months ago
No coding or reinforcement learning experience required! This step-by-step tutorial shows you how to train AI agents to play Atari games using deep reinforcement learning algorithms. We'll be using the RL Baselines3 Zoo, a powerful training framework that lets you train and test AI models easily through a command line interface. Support me here: www.buymeacoffee.com/johnnycode 00:00 Intro 00:13...
Q-Learning Tutorial 7: Adjust Learning Rate to Unstuck MountainCarContinuous-v0 from Local Optima
437 views • 2 months ago
I tried training the Gymnasium "Mountain Car Continuous" environment, but the car got stuck at the bottom of the hill. Let's talk about how to adjust the hyperparameters to get the car unstuck. My code: github.com/johnnycode8/gym_solutions Support me here: www.buymeacoffee.com/johnnycode Q-Learning Series: th-cam.com/play/PL58zEckBH8fBW_XLPtIPlQ-mkSNNx0tLS.html Ready for Deep Q-Learning? th-cam.com/vide...
Q-Learning Tutorial 6: Train Gymnasium Pendulum-v1 on Continuous Action and Observation Spaces
633 views • 2 months ago
Walkthrough of Python code that uses Q-Learning to train a Pendulum agent. My code: github.com/johnnycode8/gym_solutions Support me here: www.buymeacoffee.com/johnnycode Ready for Deep Q-Learning? th-cam.com/video/EUrWGTCGzlA/w-d-xo.html Need help installing the Gymnasium library? th-cam.com/video/gMgj4pSHLww/w-d-xo.html Reinforcement Learning Tutorials: th-cam.com/play/PL58zEckBH8fCt_lYkmayZoR9X...
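The core trick here, discretizing the continuous observation and action spaces so a plain Q-table still works, looks roughly like this (bin counts, episode count, and learning rate are illustrative, not the exact values from the linked repo):

```python
import numpy as np
import gymnasium as gym

env = gym.make("Pendulum-v1")

# Bin each continuous observation dimension and the continuous torque action.
obs_bins = [np.linspace(l, h, 15) for l, h in zip(env.observation_space.low, env.observation_space.high)]
act_bins = np.linspace(env.action_space.low[0], env.action_space.high[0], 9)

def discretize(obs):
    return tuple(np.digitize(o, b) for o, b in zip(obs, obs_bins))

q = {}  # sparse Q-table keyed by (discrete_state, action_index)
lr, gamma, epsilon = 0.1, 0.99, 1.0

for episode in range(5000):
    state, _ = env.reset()
    s, done = discretize(state), False
    while not done:
        if np.random.rand() < epsilon:
            a = np.random.randint(len(act_bins))  # explore
        else:
            a = int(np.argmax([q.get((s, i), 0.0) for i in range(len(act_bins))]))  # exploit
        obs, reward, terminated, truncated, _ = env.step([act_bins[a]])
        s2 = discretize(obs)
        # Standard Q-learning update on the discretized state-action pair.
        best_next = max(q.get((s2, i), 0.0) for i in range(len(act_bins)))
        q[(s, a)] = q.get((s, a), 0.0) + lr * (reward + gamma * best_next - q.get((s, a), 0.0))
        s, done = s2, terminated or truncated
    epsilon = max(epsilon - 1 / 5000, 0.05)  # slowly reduce exploration
```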
Q-Learning Tutorial 5: Train Gymnasium Acrobot-v1 on Continuous Action and Observation Spaces
561 views • 2 months ago
Walkthrough of Python code that uses Q-Learning to train a two-joint swinging agent. Steal my code: github.com/johnnycode8/gym_solutions Support me here: www.buymeacoffee.com/johnnycode Ready for Deep Q-Learning? th-cam.com/video/EUrWGTCGzlA/w-d-xo.html Need help installing the Gymnasium library? th-cam.com/video/gMgj4pSHLww/w-d-xo.html Reinforcement Learning Tutorials: th-cam.com/play/PL58zEckBH8f...
Tianshou Reinforcement Learning Library: DQN Sample Code Walkthrough
222 views • 2 months ago
Get started with the Tianshou Reinforcement Learning library. For this first video, we'll try running the DQN algorithm on Cartpole.
Double DQN (DDQN) Explained & Implemented | DQN PyTorch Beginners Tutorial #10
1.1K views • 3 months ago
Enhance the DQN code with Double DQN (DDQN). *Next:* th-cam.com/video/3ILECq5qxSk/w-d-xo.html *Support me here😀😀😀:* www.buymeacoffee.com/johnnycode Github code: github.com/johnnycode8/dqn_pytorch References: Double Q-Learning papers.nips.cc/paper_files/paper/2010/file/091d584fced301b442654dd8c23b3fc9-Paper.pdf Double Deep Q-Learning (DDQN) arxiv.org/pdf/1509.06461
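The one-line difference from vanilla DQN is which network picks the next action. A minimal sketch of the target calculation (function and argument names are assumptions, not the repo's code):

```python
import torch

def ddqn_targets(policy_dqn, target_dqn, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: the policy network chooses the next action,
    the target network evaluates it (vanilla DQN uses the target network for both)."""
    with torch.no_grad():
        best_actions = policy_dqn(next_states).argmax(dim=1, keepdim=True)
        next_q = target_dqn(next_states).gather(1, best_actions).squeeze(1)
        return rewards + gamma * (1 - dones.float()) * next_q
```

Letting the target network both select and evaluate the next action tends to overestimate Q-values; splitting the two roles is what reduces that bias.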
How to Generate Notes/Summaries From YouTube Videos with Gemini Flash & ChromaDB (RAG)
1.1K views • 3 months ago
Do you use TH-cam for learning? I'll show you how to generate high-quality notes from TH-cam videos using just a bit of Python. We’ll extract transcripts from videos and use Google’s Gemini Flash Large Language Model (LLM) to convert them into concise notes. Then, we’ll save these notes in a vector database (ChromaDB) and show you how to use LLM to ask questions on your saved notes. Steal my co...
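A rough outline of the pipeline described above (not the video's exact code; it assumes the youtube-transcript-api, google-generativeai, and chromadb packages, and the video id, model name, and prompts are placeholders):

```python
import os
from youtube_transcript_api import YouTubeTranscriptApi
import google.generativeai as genai
import chromadb

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# 1. Pull the transcript for one video (placeholder id).
video_id = "dQw4w9WgXcQ"
transcript = " ".join(seg["text"] for seg in YouTubeTranscriptApi.get_transcript(video_id))

# 2. Ask Gemini Flash to condense it into notes.
model = genai.GenerativeModel("gemini-1.5-flash")
notes = model.generate_content(
    f"Summarize this transcript into concise study notes:\n{transcript}").text

# 3. Store the notes in a persistent ChromaDB collection.
client = chromadb.PersistentClient(path="notes_db")
collection = client.get_or_create_collection("video_notes")
collection.add(documents=[notes], ids=[video_id])

# 4. Later: retrieve relevant notes and answer a question grounded in them.
question = "What is experience replay?"
hits = collection.query(query_texts=[question], n_results=3)
answer = model.generate_content(
    "Answer using only these notes:\n" + "\n".join(hits["documents"][0]) +
    f"\n\nQuestion: {question}").text
print(answer)
```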
Train the DQN Algorithm on Flappy Bird! | DQN PyTorch Beginners Tutorial #9
1.1K views • 5 months ago
Finally train the DQN algorithm on Flappy Bird! *Next:* th-cam.com/video/FKOQTdcKkN4/w-d-xo.html *Support me here😀😀😀:* www.buymeacoffee.com/johnnycode Github code: github.com/johnnycode8/dqn_pytorch
Test the DQN Algorithm on CartPole-v1 | DQN PyTorch Beginners Tutorial #8
1.2K views • 5 months ago
Complete the DQN algorithm code and test it against a simple environment like Cart Pole. *Next:* th-cam.com/video/P7bnuiTVJS8/w-d-xo.html *Support me here😀😀😀:* www.buymeacoffee.com/johnnycode Github code: github.com/johnnycode8/dqn_pytorch
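Testing boils down to running the trained network greedily, with no exploration, and averaging the episode returns. A minimal sketch (assuming a hypothetical policy_dqn that maps a state tensor to Q-values; not the repo's exact test code):

```python
import torch
import gymnasium as gym

def test(policy_dqn, episodes=100, render=False):
    """Run the trained network greedily and return the average episode return."""
    env = gym.make("CartPole-v1", render_mode="human" if render else None)
    returns = []
    for _ in range(episodes):
        state, _ = env.reset()
        done, total = False, 0.0
        while not done:
            with torch.no_grad():
                # Always take the action with the highest predicted Q-value.
                action = policy_dqn(torch.tensor(state, dtype=torch.float32)).argmax().item()
            state, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    env.close()
    return sum(returns) / len(returns)
```

CartPole-v1 caps episodes at 500 steps, so an average return near 500 means the pole is almost never dropped.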
Optimize the Target Network PyTorch Code | DQN PyTorch Beginners Tutorial #7
888 views • 5 months ago
Make the Target Network code more efficient by taking advantage of PyTorch capabilities. *Next:* th-cam.com/video/Ejv8yv5-i0M/w-d-xo.html *Support me here😀😀😀:* www.buymeacoffee.com/johnnycode Github code: github.com/johnnycode8/dqn_pytorch
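The PyTorch shortcut in question is copying the whole state dict at once rather than assigning weights layer by layer. A minimal sketch (names are assumptions):

```python
def sync_target(policy_dqn, target_dqn):
    """Hard update: copy every weight of the policy network into the target
    network with one call instead of looping over individual layers."""
    target_dqn.load_state_dict(policy_dqn.state_dict())
```

An alternative some implementations use is a soft (Polyak) update, blending a small fraction of the policy weights into the target every step instead of copying everything at a fixed interval.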
Explain Loss, Backpropagation, Gradient Descent | DQN PyTorch Beginners Tutorial #6
852 views • 5 months ago
Implement the Target Network | DQN PyTorch Beginners Tutorial #5
1.2K views • 5 months ago
How to Set Up ChromaDB with Docker & Enable Role-Based Token Authentication
1.8K views • 5 months ago
Implement Epsilon-Greedy & Debug the Training Loop | DQN PyTorch Beginners Tutorial #4
1.3K views • 5 months ago
Implement Experience Replay & Load Hyperparameters from YAML | DQN PyTorch Beginners Tutorial #3
1.4K views • 5 months ago
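A minimal sketch of the two pieces named in the title, a deque-based replay memory and hyperparameters loaded from YAML (the parameter names and file layout here are illustrative, not the repo's exact file):

```python
import random
from collections import deque
import yaml  # PyYAML

class ReplayMemory:
    """Fixed-size buffer of (state, action, new_state, reward, terminated) tuples."""
    def __init__(self, maxlen):
        self.memory = deque(maxlen=maxlen)

    def append(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size):
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)

# hyperparameters.yml might look like:
#   cartpole1:
#     replay_memory_size: 100000
#     mini_batch_size: 32
#     epsilon_init: 1.0
#     epsilon_decay: 0.9995
with open("hyperparameters.yml") as f:
    params = yaml.safe_load(f)["cartpole1"]

memory = ReplayMemory(params["replay_memory_size"])
```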
Implement the Deep Q-Network Module | DQN PyTorch Beginners Tutorial #2
2.3K views • 5 months ago
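At its simplest, the Deep Q-Network module is a small fully connected network that maps a state to one Q-value per action. An illustrative sketch (layer sizes are made up, not the tutorial's exact code):

```python
import torch
from torch import nn
import torch.nn.functional as F

class DQN(nn.Module):
    """Minimal Q-network: state in, one Q-value per action out."""
    def __init__(self, state_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, action_dim)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return self.out(x)  # raw Q-values; the chosen action is the argmax

# Quick shape check: a CartPole-like state (4 values) and 2 actions.
print(DQN(4, 2)(torch.randn(1, 4)).shape)  # torch.Size([1, 2])
```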
Implement Deep Q-Learning with PyTorch and Train Flappy Bird! | DQN PyTorch Beginners Tutorial #1
5K views • 5 months ago
Q-Learning Tutorial 2: Train Gymnasium Taxi-v3 on Multiple Objectives
2.7K views • 6 months ago
How to Install Gymnasium Box2D on Windows Subsystem for Linux (WSL)
933 views • 6 months ago
Install PettingZoo Multi-Agent Reinforcement Learning (MARL) Library
526 views • 6 months ago
Stable Baselines3 Tutorial 3: Save the Best Model and Auto-Stop Training | Demo on BipedalWalker-v3
641 views • 6 months ago
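The pattern from this video relies on two Stable Baselines3 callbacks: EvalCallback saves the best model, and StopTrainingOnRewardThreshold ends training once the evaluation reward is high enough. A rough sketch (the algorithm choice, save path, and threshold are illustrative):

```python
import gymnasium as gym
from stable_baselines3 import SAC
from stable_baselines3.common.callbacks import EvalCallback, StopTrainingOnRewardThreshold

env = gym.make("BipedalWalker-v3")
eval_env = gym.make("BipedalWalker-v3")

# Stop once the best evaluation reward crosses the threshold.
stop_callback = StopTrainingOnRewardThreshold(reward_threshold=300, verbose=1)

eval_callback = EvalCallback(
    eval_env,
    callback_on_new_best=stop_callback,  # fires whenever a new best mean reward is found
    best_model_save_path="./models/",    # the best model is saved here automatically
    eval_freq=10_000,
    verbose=1,
)

model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=1_000_000, callback=eval_callback)
```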
Stable Baselines3 Tutorial 2: Dynamically Load RL Algorithm for Training | Demo on Pendulum-v1
752 views • 6 months ago
Stable Baselines3 Tutorial: Beginner's Guide to Choosing Reinforcement Learning Algorithms
1.5K views • 6 months ago
MIP Broke My C# File Saving Excel Code, How to Fix it with 1 Line (Microsoft Information Protection)
38 views • 6 months ago
REINFORCE (Vanilla Policy Gradient VPG) Algorithm Explained | Deep Reinforcement Learning
1.7K views • 7 months ago
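The heart of REINFORCE: run an episode with a stochastic policy, compute the discounted return for every step, then increase the log-probability of each action in proportion to its return. A minimal sketch (network sizes and the return normalization are illustrative):

```python
import torch
from torch import nn

class PolicyNet(nn.Module):
    """Small policy network for a discrete action space."""
    def __init__(self, state_dim, action_dim, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, action_dim), nn.Softmax(dim=-1))

    def forward(self, x):
        return self.net(x)  # action probabilities

def reinforce_update(optimizer, log_probs, rewards, gamma=0.99):
    """One REINFORCE update from a finished episode's log-probs and rewards."""
    returns, g = [], 0.0
    for r in reversed(rewards):           # discounted return for each step
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # normalize for stability
    loss = -(torch.stack(log_probs) * returns).sum()               # policy gradient loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```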
Thanks for your work. Is the method you use called Double Deep Q-Learning?
No, this is not Double Deep Q-Learning, even though 2 networks are used. I have an explanation of DDQN in my DQN deep dive series: th-cam.com/video/FKOQTdcKkN4/w-d-xo.html
Thank you for the awesome videos!
Thanks!!!
Thank you!
This video was helpful for me and solved my problem, thanks so much.
Great tutorial thanks
Ready to get started with Stable Baselines3? th-cam.com/video/OqvXHi_QtT0/w-d-xo.html
Hi Johnny, Great Video! I have been following along with your flappybird series up to now. Can this be implemented instead of using the python deque experience replay function? If so, could you demonstrate an implementation of this? Thanks!
Yes, I will show how to combine Prioritized Experience Replay with DQN in a future video.
@ Amazing, thank you. I’ve managed (With a little help from Copilot) to throw together something that works in this regard. But I’d love to see how you do it also. 👍
Thanks for the great video. It makes it easier for me to build my own training environments.
Hi, can you please explain why, even when I'm not using the slippery option, it still sometimes learns nothing?😁
One possibility is that epsilon is decreased to 0 before the elf finds the goal (or finds the goal often enough). In this case, try slowing down the epsilon decay rate. Epsilon-Greedy has a randomness factor, so it is possible that the elf never randomly finds the goal.
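For example, a slower schedule might look like this (the numbers are just illustrative):

```python
# Decay epsilon more slowly so random exploration lasts long enough
# for the agent to stumble onto the goal a few times.
epsilon = 1.0
epsilon_min = 0.05
episodes = 15000
epsilon_decay = (epsilon - epsilon_min) / episodes  # linear decay spread over ALL episodes

for episode in range(episodes):
    # ... run one episode with epsilon-greedy action selection ...
    epsilon = max(epsilon - epsilon_decay, epsilon_min)
```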
Dude, this is really simple and impressive. Love it! Thank you for putting this effort into this content.
A+ explanation. Thank you for sharing and explaining your work, that was a HUGE help for me.
Thanks for the vid! So after training the network, how does testing work?
You can modify the test() function to not render the graphics and also collect some statistics on success rate. Then run test() for X number of times and check the success rate.
I found your channel recently while trying to grasp some details about ML and now RL, and thanks for the nice series and videos for beginners that you created!
You are the man. You saved me hours of work 🔥
Thank you, your videos are very helpful. Just curious, is there any generalized learning strategy, so it can solve different environments as well?
Q-Learning is the basis of one of the general learning strategies: Deep Q-Learning, also known as Deep Q-Network or DQN. My other video on Stable Baselines3 shows how to use a reinforcement learning library to apply several different learning algorithms: th-cam.com/video/OqvXHi_QtT0/w-d-xo.html
Thank you for the great videos. I'm not a native English speaker, so I've been putting a lot of effort into reading the documents on the Stable Baselines3 website. But you made SB3 and RL Zoo easier for me. I really appreciated this. Thank you so much.
This is an excellent video with a nice simple introduction to the ideas of a neural network, well done! I'm afraid there is a bit of an error in the math (and I don't mean the 2y thing). When motivating the MSE, you showed that the sum of (y[i] - y_hat[i]) was zero with w=0.5. This is correct of course, but it shows that the gradient couldn't be 2x/n * sum(y[i] - y_hat[i]), otherwise it would just be 0. However, your code has the mostly correct formula, which is -2/n * sum(x[i] * (y[i] - y_hat[i])). The problem is that the loss isn't a function of a single y_hat; it is a multivariable function of y_hat[i] for i = 1, 2, 3, 4. This of course complicates the math, because the derivative dl/dw involves using the chain rule for partial derivatives. Also, in your code, when you take the dot product, it produces a scalar, so the mean is taken over that single value and doesn't divide by n=4. You would need to either do the division directly instead of calling mean(), or use numpy's element-wise array multiplication instead of the dot product, and then take the mean of the resulting array.
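A quick numeric check of what I mean, with made-up data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])  # true relation y = 2x
w = 0.5
y_hat = w * x

# Correct gradient of the MSE w.r.t. w: dl/dw = -2/n * sum(x_i * (y_i - y_hat_i)).
# Element-wise multiply keeps an array, so .mean() really divides by n.
grad_elementwise = (-2.0 * np.multiply(x, y - y_hat)).mean()   # -22.5

# np.dot already sums the products into a scalar, so .mean() changes nothing
# and the result is n times larger (it only rescales the effective learning rate).
grad_dot = (-2.0 * np.dot(x, y - y_hat)).mean()                # -90.0

print(grad_elementwise, grad_dot)
```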
Thanks for the feedback, I’ll go back and check!
Thanks a lot
Fire Video brother!
Hi, thanks for the video. Did you try to solve it with SARSA and SARSA(λ) too? If yes, did you find the optimal solution? I am currently stuck in a suboptimal solution...
I have not tried SARSA. Maybe try someone's SARSA GitHub code and see if it makes any difference.
Only shows the calls - NOT the redirects...
This is so well explained, from theory to code! The handwritten part that goes step by step really helps me better understand the DQN. And then, as a cherry on top, there’s the nice, easy code walkthrough that connects with the handwritten steps. Thanks a ton for making this video!
Man, if only more of youtube were like this: clear, to the point, no long introduction trying to make this a 20min video, just pure information! Thank you!!
thank you so much dude
Thank you very much for this video! It saved me a lot of time. I also subscribed.
Thanks for the video. It's really helpful. I have a question about ChromaDB used in the Google Colab environment. Quite often, I'm facing the issue "OperationalError: attempt to write a readonly database"... I've tried several approaches I found on the Internet without success. Do you have any suggestions for me? Thanks.
I've used Colab with the ChromaDB files in Google Drive, is this what you're doing: th-cam.com/video/ziiLezCnXYU/w-d-xo.html
@@johnnycode Not exactly the same. I was using LangChain's wrapper for a simple RAG scenario... "Chroma.from_documents" was the function I used to ingest encodings out of a PDF doc.
The only thing I can think of is if you are loading data using multiple Colab sessions or using Python multiprocessing, since that would lock up the backend SQLite database.
I have been using a different code structure with DQN to solve the MountainCar-v0 problem with discrete actions. The main problem is that every time I run the same code, I get different results, sometimes good and sometimes bad. What do you advise? This problem is really embarrassing me. Thank you for this video and keep up the good work, brother. I wish I could show you my code to discuss it, but I think YouTube does not allow us to do that. Anyway, I can't wait to see your response.
I have a video using DQN on MountainCar-v0 with discrete actions: th-cam.com/video/oceguqZxjn4/w-d-xo.html If you wrote your own DQN code, you can compare it to mine and check for mistakes: th-cam.com/play/PL58zEckBH8fCMIVzQCRSZVPUp3ZAVagWi.html You can use Stable Baselines3's DQN as well: th-cam.com/video/OqvXHi_QtT0/w-d-xo.html
Great tutorial mate, keep going!
Hey @johnnycode! Great to see this series! I can't wait to see what else you teach us in Godot. Great video!
You can spend hours trying to figure out how to install Gymnasium or... just watch this video! Thanks!
Great video btw, really helpful! Just a heads up for anyone trying to render using numpy>2.0.0, there might be a TypeError: "size must be two numbers" when using the env.step function. Can be fixed by downgrading to numpy==1.26.4
a time saver for sure, thanks!
Thanks a lot
Best Deep-Q tutorial
you just saved me a lot of frustration. thank you.
Hey, I love your videos, they were so helpful, and I appreciate all the work you've done. I was wondering if you could make, or know of, any tutorials on building your own environment, as I would like to try to make my own model and train it if possible :)
Check out my videos in my "build your own RL environment" playlist and let me know what you think: th-cam.com/play/PL58zEckBH8fDt-F9LbpVASTor_jZzsxRg.html
October 9th works for me, thanks man!
took so long to find a video this helpful, thanks!!
Bro, you're amazing
He just goes up and down pasting code, not teaching a thing, but the code seems to work. Need to pause this guy's ChatGPT voice a hundred times.
Awesome content, man. I've been searching for videos on RL for a while, and you're the best I've come across. Do you have a 'path' to learn and practice RL? I'm really interested in it. I know ML and a little bit of DL, but I don't know much about RL. If you have a good learning path, I would be grateful. I'm struggling to find one.
Thanks, but I'm not an RL expert. I just make videos about what I've learned to make RL easier for others. I see questions similar to yours in the 'reinforcementlearning' subreddit, so you might want to try searching there.
Thank you for the great tutorial video! I appreciate the effort you put into preparing it. I did notice a small detail in the code: it should be "np.multiply(...).mean()" instead of "np.dot(...).mean()". The "np.dot(...)" function returns a single value, which is the sum of the element-wise products. This doesn't affect the final output, as it only scales the actual learning rate, so the program can still converge to the correct values for w and d.
How to synchronize the database and the model when both need to be updated?
You have to do some manual work to sync your model and database. Use forward engineer to generate the SQL script, edit the script by hand and run it against the db. Then reverse engineer to recreate your model.
Nice! Simple and easy!
thanks great video
thank you very much i love you from Thailand
Hey Johnny, thank you for such an informative video, it really helped me fix my implementation. I do have a quick question on the location of the target network update (line 119). From my understanding, it is outside the while loop because we don't want to train the network before we have reached the terminal state, but wouldn't this contradict the algorithm, where it should be placed inside the while loop?
And just to be clear, having it outside the episode-step while loop helps the model train and reach convergence faster. I'm just unsure why the algorithm flow in textbooks and most implementations would still place the DQN training step inside the while loop.
You could stick strictly to the algorithm in the published papers, but people have taken the liberty of making some adjustments in their own implementations. You can actually put the training step inside or outside of the while loop. I specifically mentioned this in my DQN PyTorch implementation series; check out minute 5 in this video: th-cam.com/video/vYRpJo-KMSw/w-d-xo.html
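For illustration, here is a rough sketch of both placements, with hypothetical select_action() and optimize() helpers standing in for the real code:

```python
def train_per_step(env, memory, select_action, optimize, num_episodes):
    """Placement (a): optimize on a mini-batch at every environment step,
    as written in the published DQN algorithm."""
    for _ in range(num_episodes):
        state, done = env.reset()[0], False
        while not done:
            action = select_action(state)
            next_state, reward, terminated, truncated, _ = env.step(action)
            memory.append((state, action, next_state, reward, terminated))
            optimize()  # inside the step loop
            state, done = next_state, terminated or truncated

def train_per_episode(env, memory, select_action, optimize, num_episodes):
    """Placement (b): collect a whole episode first, then optimize once.
    Fewer gradient updates per environment step, often faster wall-clock time."""
    for _ in range(num_episodes):
        state, done = env.reset()[0], False
        while not done:
            action = select_action(state)
            next_state, reward, terminated, truncated, _ = env.step(action)
            memory.append((state, action, next_state, reward, terminated))
            state, done = next_state, terminated or truncated
        optimize()  # after the episode ends
```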
thanks a lot
Hi! On the first step, I receive the error message "CondaValueError: The target prefix is the base prefix. Aborting." any ideas on what my issue is?
You got the error running this: "conda create -n gymenv"? Copy and paste from this reply and try again please.