A.I. Learns to Play Space Invaders | Reinforcement Learning

Nicholas Renotte

11,858 views

Published on 11 Dec 2024

Comments • 45

@ShijuThomas1 3 years ago ⁺¹
Nice work - so much fun to see / and can see that you're having fun with it Mr 'Bambi Eyes' - perfect nick too
@NicholasRenotte 3 years ago ⁺¹
😂😆 I'm having wayyyy too much fun @Shiju! Hope you've been amazing!
@marlonscott160 3 years ago ⁺¹
Once again Nicholas, well done! Thank you. With every one of your videos I become more and more a Deep Learning and Reinforcement Learning enthusiast!
@NicholasRenotte 3 years ago
Love it! It's such an awesome space to be taking a look at @Marlon!
@goksel9908 2 years ago ⁺¹
When I write seed = 0 like at 2:05, I get:
AttributeError: module 'tensorflow' has no attribute 'set_random_seed'
It's probably about the TensorFlow version, I get it. But I don't want to change versions. I'm staying in a dorm and the internet is snail speed :)
I searched many things, many pages, but I couldn't figure it out. Could you help please?
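One possible way around the error above without reinstalling anything, assuming the seed call lives in your own script: TensorFlow 2.x exposes the global seed setter as `tf.random.set_seed`, while 1.x used `tf.set_random_seed`. The helper below is only a sketch (`seed_everything` and its `tf_module` parameter are made-up names, not something from the video). It tries the 2.x name first and falls back to the 1.x one. If the failing call sits inside a third-party RL library rather than your own code, this will not help, since that library was probably written against TensorFlow 1.x.

```python
import random


def seed_everything(seed, tf_module=None):
    """Seed Python's RNG and, if available, TensorFlow's global RNG.

    `tf_module` is a hypothetical parameter so the helper can be tried
    without TensorFlow installed; normally leave it as None.
    Returns the name of the TensorFlow seed function that was used.
    """
    random.seed(seed)

    if tf_module is None:
        try:
            import tensorflow as tf_module
        except ImportError:
            return "no tensorflow"

    # TensorFlow 2.x location of the global seed setter.
    tf_random = getattr(tf_module, "random", None)
    if tf_random is not None and hasattr(tf_random, "set_seed"):
        tf_random.set_seed(seed)
        return "tf.random.set_seed"

    # TensorFlow 1.x location.
    if hasattr(tf_module, "set_random_seed"):
        tf_module.set_random_seed(seed)
        return "tf.set_random_seed"

    raise AttributeError("no known TensorFlow seed function found")
```

Calling `seed_everything(0)` in place of the line that fails should then work on either major version.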
@anthonylwalker 3 years ago ⁺³
Would definitely like to see a more in-depth tutorial on this like your others!
@NicholasRenotte 3 years ago
You got it, added to the list @Anthony!
@pratikkorat790 3 years ago ⁺³
Last night I saw your story and was wishing you would make the video, and today taddaa....🤗
@NicholasRenotte 3 years ago
YESS, tbh I've been trying to solve it since November last year and I finally got it working!! So happy you enjoyed it @Pratik, I'm going to do a full tutorial on it soon!
@ashleysami1640 3 years ago ⁺⁴
Not a gamer or an advanced coder, but this was super cool to watch!
Nice work bambi eyes 😅
@NicholasRenotte 3 years ago
🤣 thanks so much @Ash! Bambi’s 🦌 pretty glad you enjoyed the video ❤️
@computer_vision 2 years ago
Hello sir, can you provide a pre-trained checkpoint so that we can get a starting boost?
@code16595 3 years ago ⁺¹
You are amazing bro.. I'm a fan of your YouTube channel now. Can't wait to see your next AI video content.
@NicholasRenotte 3 years ago
Thanks sooo much! Much love @Toufan RA!
@pitter6636 3 years ago ⁺²
I enjoyed this way too much. It's /refreshing and tragic/ to see the TensorBoard and notice that even with 40M episodes of training there is A TON of reward variability between episodes, so it plays better on average but still messes up a lot episode to episode. (I'm not asking you to do this), but I want to know your opinion on how consistent you think the model would be with, let's say, 1 billion episodes of training. Would we still see a lot of variability, or would it start decreasing?
@NicholasRenotte 3 years ago
Yeah agreed, it seems crazy how unstable the training is but tbh I've seen this happen in almost every Atari environment and anything more sophisticated than CartPole. After a billion episodes I think we'd probably be in the realm of an overfit model. After about 38M steps I started to see a drop-off in performance; a bunch of research I read pointed to using early stopping to prevent this. It seems like there is an optimal state after which the model perhaps can't handle the environment complexity.
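For anyone wondering what early stopping could look like here, this is a rough sketch, not the setup used in the video. It assumes you evaluate the agent every so often and log the mean reward; `best_checkpoint` and the `patience` value are made up for illustration. It keeps the best evaluation so far and stops once several evaluations in a row fail to beat it, so you keep the checkpoint from before the drop-off.

```python
def best_checkpoint(eval_rewards, patience=3):
    """Pick the best evaluation and the point at which to stop training.

    eval_rewards: mean evaluation reward at each periodic evaluation.
    patience: how many evaluations in a row may fail to beat the best
              before training is stopped (an assumed value, tune it).
    Returns (best_index, best_reward, stopped_at_index).
    """
    best_idx, best_reward = None, float("-inf")
    bad_evals = 0
    for i, reward in enumerate(eval_rewards):
        if reward > best_reward:
            # New best: remember it and reset the patience counter.
            best_idx, best_reward, bad_evals = i, reward, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return best_idx, best_reward, i
    # Never ran out of patience: stopped at the last evaluation.
    return best_idx, best_reward, len(eval_rewards) - 1


# Invented numbers, only to show the behaviour (not from the video's run).
rewards = [200, 350, 500, 480, 510, 505, 490, 470, 600]
print(best_checkpoint(rewards, patience=3))
```

Because Atari rewards are so noisy, averaging each evaluation over many episodes matters; with a small `patience`, a single unlucky stretch can stop training too early (in the invented list above, the later 600 is never reached).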
@welidbenchouche 1 year ago
I would like to see a wheeled mobile robot navigating through an environment while avoiding obstacles, and I would also love to know if it's possible to build my own environment.
@shashankkumar851 3 years ago ⁺¹
Can we implement it in TensorFlow.js?
@NicholasRenotte 3 years ago ⁺¹
Haven’t tried as of yet! Tbf I don’t know how many RL libraries are available in JS yet. Would be a fair amount of coding to write from scratch with TFJS though @Shashank!
@MelvinAdekanye 3 years ago ⁺²
Missed opportunity. Should have been like: "My friend here AI is cracked at Space Invaders"
@NicholasRenotte 3 years ago
Dammit 😂 would've been golden!
@guyincognito1985 3 years ago ⁺¹
Is there a code repo for this on GitHub?
How is the "AI" working here? Is it essentially: make random moves (or don't move) and try to maximize the score at the time of the player's death (or 3 deaths), then repeat that over and over again (millions of times)? Is the "AI player" actually aware of the dropping missiles like a human would be?
Makes me want to take out my Raspberry Pi and see how good I am at Space Invaders. I played it in the arcades when it first came out once or twice, but I played it the most on my dad's Osborne computer. I think it was written in BASIC?
Thanks!
@NicholasRenotte 3 years ago
There's code for a similar model on GH but I never released the code used for this. Didn't get around to making a full-blown tutorial for it. It uses the RL models from this: th-cam.com/video/nRHjymV2PX8/w-d-xo.html but the environments from this: th-cam.com/video/hCeJeq8U0lo/w-d-xo.html.
You're correct, we're actually passing a series of images to the model so in theory it learns to "see" the missiles and other ships using Convolutional Neural Network layers. I tried playing myself before training and let me tell you, I have in no way improved from when I last played 10 years ago 🤣.
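The "series of images" idea is usually done with frame stacking: the model gets the last few screens at once, so it can tell which way a missile is moving. Here is a minimal sketch of that idea. `FrameStack` is a hypothetical helper and the stack size of 4 is an assumption (a common choice for Atari), not necessarily what was used in the video; the frames can be any objects, such as image arrays.

```python
from collections import deque


class FrameStack:
    """Keep the last `k` frames so the model sees motion, not one still image."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # At episode start, fill the stack with copies of the first frame.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, new_frame):
        # The oldest frame is dropped automatically because of maxlen.
        self.frames.append(new_frame)
        return list(self.frames)


# Tiny stand-in "frames": a missile (1) falling down a 3-pixel column.
stack = FrameStack(k=3)
stack.reset([1, 0, 0])
stack.step([0, 1, 0])
print(stack.step([0, 0, 1]))  # the three frames together show the direction of travel
```

The stacked frames are what the convolutional layers would take as input, one channel per frame.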
@datgatto3911 3 years ago ⁺¹
Hi, it's me again. Nice training, I like the way you talk about the testing between training steps. Subscribed! :D But do you plan to do more videos on making RL from scratch, like explaining and implementing A2C, A3C, PPO,...? I'd really like to see those tutorials and would like to discuss them if I get the chance.
@NicholasRenotte 3 years ago ⁺¹
I wasn't but I can add it to the list if you'd like?
@barkatmessaouda3611 3 years ago
Hi sir, I trained my own deep Q-learning model with Keras-RL for 2 hours on CPU and got a maximum score of 195, but on CPU it takes a lot of time. Can I train the model on a Colab GPU?
@NicholasRenotte 3 years ago
Sure can!
@sitalatha3312 8 months ago
For some reason (maybe due to it being a cloud and answering many requests at once, or due to network latency) the speed I get on Colab Pro is just 27 ms/step, but in PyCharm on my M1 MacBook Pro I get 6 or 7 ms/step.
@eleutheras 1 year ago
no aliens' bomb in model 6...🤔
@vikashchand. 3 years ago ⁺³
Cool stuff man! 🔥 Love the storytelling explanations lol
Do a big project based tutorial bro! Maybe something with a real-world implementation of AI and ML on the web or something 😎👌
@atifasadkhan 3 years ago ⁺¹
I agree
@NicholasRenotte 3 years ago ⁺¹
YOOO @Vik! Definitely, I've been doing a bunch of planning. How big of a project are we talking about here 😂? I'm thinking of doing more end-to-end stuff, kind of how I did the Sign Language model from Python all the way to a deployed TensorFlow.js web app. Anything else you'd like to see? Possibly some NLP based stuff? YouTube title generator app?
@vikashchand. 3 years ago ⁺²
@NicholasRenotte NLP sounds cool! You create some really informative and unique stuff man! 🔥 As for the big project, end-to-end yes! Those are nice. If I'm not wrong though, I understand you have a finance and accounting background? How about an ML project in that, something like your sales forecasting in Excel tutorial but bigger? Or maybe some geographical data analysis stuff, some big data, and GIS stuff 🤘, or maybe integrate some planning analytics or advanced NLP stuff with React.js and/or Flutter as part of your Full-Stack series + end-to-end, that would be cool too!
@NicholasRenotte 3 years ago ⁺¹
@vikashchand. hmmm, these are awesome suggestions! So maybe some more business focused stuff, I've got a ton of content on that so can definitely roll it out!
@yizhakshachar 1 year ago ⁺¹
This artificial intelligence is not that sophisticated, because it only memorizes moves and does not know the principle of the game. Sophisticated artificial intelligence is supposed to calculate the speed of the firing rate, the movement of the invading spaceships, our spaceship and the bonus spaceship that needs to be intercepted and thus know where to move and when to shoot.
@sitalatha3312 8 months ago
That’s not how it works. It takes a lot of inputs and is very sophisticated on the inside, but because it is not explicit in the code you can't see it. Good question though!
@facundogoiriz7323 3 years ago ⁺¹
Thank you
@NicholasRenotte 3 years ago ⁺¹
Anytime @Facundo! Enjoy it?
@schafkopfbam6684 3 years ago
Cool video! Can a DQN agent perform that well on Space Invaders too?
@erickgomez7775 1 year ago
[puts on hat] it's hacker time!
@LeilaniMano 7 months ago
Human versus computer.
Calculator vs calculator.
@AK-ox3mv 1 month ago
It tried peace with the aliens until 20,000,000 steps, then got angry👾
@atifasadkhan 3 years ago ⁺¹
It was cool
@NicholasRenotte 3 years ago
Thanks so much @Atif! 🙏
@jtreg 2 years ago ⁺³
Too much of your head and sadly lacking in actually focusing on the code. Despite the superficial style it did include some useful signposts for absolute beginners. However, I would have liked less 'selfie' and more on the code itself.
