AI DESTROYS The Centre Of The Universe | Super Mario Galaxy

AI Tango

Views: 22,594

Published on 23 Nov 2024

Comments • 152

@bitblit 9 months ago ⁺³⁰
This is a greater testament to your ingenuity and patience than it is to what can be done with AI. You somehow painstakingly created a decent reward function for a game that was never meant to have AI playing it. Mad props man, most would have given up before reaching this point.
@aitango 9 months ago ⁺⁸
Thanks, that really means a lot!
@wrk-pc1qj 8 months ago ⁺¹
1@@aitango
@sketcher2459 9 months ago ⁺⁶⁷
that lil wall jump at 15:47 was cleeaaaannn
@jackatk 9 months ago ⁺⁶
Again at 19:21 🔥🔥👏
@Marksman560 9 months ago ⁺⁴⁰
"How I made Mario fight through the (fiery) pain? with MOAAR PAINNNN!" 😆
@aitango 9 months ago ⁺⁸
ahahah
@PolarTheFurry 9 months ago ⁺⁴⁶
Super cool to see your AI beat such a complex level. It was already pretty mindblowing seeing it beat the simpler one before, but this takes it to a whole new level! I wonder if it would eventually be able to play the whole game? It would probably need years of training, but I don't think it would be impossible with where AI is getting
@aitango 9 months ago ⁺²⁰
Really glad you liked it! Maybe eventually, as you mention that would take an incredibly long time! To get this to work however I did have to do a lot of handcrafting of rewards, so building that system for the whole game would require a tonne of work
@somdudewillson 9 months ago ⁺⁷
@@aitango It might be possible to train a 'reward network' on gameplay footage of the game, and have it output how far through the level a particular frame of gameplay is.
@superrobotthunderjesus2332 4 months ago ⁺²
@@somdudewillson This would be extremely easy, considering all you would need would be multiple runs of the level, and the predicted value for the label could be ((current time)/(runtime of video)); you could even refine it; for every death, add that run to the training data of the reward model, but with the sign flipped. For every successful run that is faster, add that to the training data.
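The labelling scheme suggested in this reply can be sketched roughly as follows. This is only an illustration under assumptions: a recorded run is treated as a plain list of frames, and `progress_labels`, `build_dataset` and the `died` flag are hypothetical names, not anything shown in the video.

```python
from typing import List, Tuple


def progress_labels(num_frames: int, died: bool = False) -> List[float]:
    """Label frame i with i / (num_frames - 1), i.e. (current time) / (runtime).

    Runs that end in a death get the sign flipped, as the comment suggests.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames to define progress")
    sign = -1.0 if died else 1.0
    return [sign * i / (num_frames - 1) for i in range(num_frames)]


def build_dataset(runs: List[Tuple[list, bool]]) -> List[Tuple[object, float]]:
    """Pair every frame with its label; each run is (frames, died)."""
    dataset = []
    for frames, died in runs:
        labels = progress_labels(len(frames), died)
        dataset.extend(zip(frames, labels))
    return dataset
```

A regression model trained on such pairs would then act as the 'reward network'; how well that works in practice is an open question here.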
@TheFurry 6 months ago ⁺³
poor ai saw the lava tunnel and just thought "yeah, no"
@alansmithee419 9 months ago ⁺⁹
I've been waiting let's goooooo.
Honestly I think this is my favourite channel right now for actual demonstrations of AI learning. So few people are actually working on smaller-scale things like this for YouTube and the like and I love seeing their progress over time.
@aitango 9 months ago ⁺³
Thanks, really great to hear! Was great to finally upload again, it's been a while
@Ganonmustdie2 9 months ago ⁺¹³
After watching this AI struggle for over 100 hours on a single level, I can safely say we've found the new 3rd Game Grump
@TheFurry 6 months ago ⁺³
LOL this is brutal
@HideyHoleOrg 9 months ago ⁺⁵⁹
Your AI plays Mario better than I do. I have always seen Mario (and many other) AIs that try to reproduce a speed run. I would like to see one that looks good while doing it. Picks the path because it takes them right over the coins, bounce off the enemy, grab the mushroom, shoot the fireball. Can you make this happen?
@joeewert4503 9 months ago ⁺⁶
You could prob reward it for picking up coins and mushrooms but it's still gonna look unnatural as it's taking the path of least resistance.
@yourmomsboyfriend3337 9 months ago ⁺⁷
Reinforcement learning is not actually a great way to make a TAS. You would think it could be great, and you can even encourage it to speed up by decreasing the reward every time step, effectively incentivizing it to finish the episode as quickly as possible to save as much reward as possible. But in reality, in an environment as complex as this, there are billions of local optima that the model will fall into instead of choosing an optimal path, and you end up with something that clearly has a better route visible to a human. You’re much better off using a modified Monte Carlo Tree Search to essentially brute force the best path, as it would be much faster.
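The "decrease the reward every time step" trick mentioned here can be illustrated with a toy calculation. This is a sketch only: `step_reward`, `episode_return` and the penalty value are made-up names and numbers, not the reward used in the video.

```python
from typing import Iterable


def step_reward(progress_delta: float, time_penalty: float = 0.01) -> float:
    """Reward for one step: progress made, minus a fixed cost for the time spent."""
    return progress_delta - time_penalty


def episode_return(progress_deltas: Iterable[float], time_penalty: float = 0.01) -> float:
    """Undiscounted sum of step rewards over an episode."""
    return sum(step_reward(d, time_penalty) for d in progress_deltas)


# Two episodes that cover the same total progress in different numbers of steps.
fast = [0.5, 0.5]
slow = [0.25, 0.25, 0.25, 0.25]
print(episode_return(fast) > episode_return(slow))
```

The faster episode keeps more reward, which is the incentive the comment describes; it does nothing about the local-optima problem raised in the same comment.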
@PixlRainbow 9 months ago
@@yourmomsboyfriend3337 there is a sort of hybrid approach called Curiosity-Driven Reinforcement Learning. A reward is given for checkpoint progress, but it is also given for encountering a previously unknown state. The main downside is that it's significantly more memory intensive than both approaches.
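One simple way to reward "encountering a previously unknown state" is a count-based novelty bonus, sketched below. This is a stand-in chosen because it matches the memory cost described in the comment; it is not necessarily what curiosity-driven methods in the literature do, and `NoveltyBonus` and its `state_key` argument are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable


class NoveltyBonus:
    """Bonus that shrinks as the same (discretised) state is seen again."""

    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale
        self.counts: Counter = Counter()  # grows with every distinct state seen

    def __call__(self, state_key: Hashable) -> float:
        self.counts[state_key] += 1
        return self.scale / math.sqrt(self.counts[state_key])


bonus = NoveltyBonus()
total_reward = 0.3 + bonus(("room_2", 14, 7))  # checkpoint progress + novelty
```

The `counts` table is the part that gets large for complex worlds, since every distinct state key has to be remembered.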
@timpl0168 9 months ago ⁺¹³
Are there any random factors in Super Mario Galaxy or will the AI succeed by just learning one fixed sequence of inputs?
@aitango 9 months ago ⁺¹⁷
I'm not actually sure if the game has any randomness or is completely deterministic. During training I force the AI to take random actions, partially to help explore but also to prevent it from just learning a fixed sequence of inputs
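Forcing occasional random actions is commonly done with an epsilon-greedy rule. The reply does not say which scheme is actually used, so the sketch below is an assumption, and `choose_action` is a hypothetical helper.

```python
import random
from typing import Sequence


def choose_action(q_values: Sequence[float], epsilon: float, rng=random) -> int:
    """With probability epsilon pick a random action, otherwise the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Example: three actions, 10% chance of ignoring the value estimates.
action = choose_action([0.1, 0.9, 0.3], epsilon=0.1)
```

Because the random steps change what happens next, a memorised input sequence stops working, which matches the second motivation given in the reply.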
@somdudewillson 9 months ago ⁺³
It isn't particularly likely that the AI would be able to learn a fixed sequence of inputs - it doesn't really have the architecture necessary to store a bunch of data exactly and then read from that memory at specific times.
@marckiezeender 8 months ago ⁺¹
@@aitango if you use dolphin's save-state system to reset the level, then the rng is deterministically the same as the previous save-state
@Drawoon 7 months ago ⁺²
There's some great lessons to learn with this one. For example, if you think the most rational thing to do is to jump in a black hole instead of facing what's ahead, you should get your reward functions looked at. If you sort things out, it will get better and you can make it to the other side.
@OmegaChip 9 months ago ⁺¹²
Amazing, absolutely. I wonder how it might perform on a level that is a bit more open, maybe even including a boss.
Or the trial stars.
@aitango 9 months ago ⁺⁵
Open levels are something I really don't know how it would perform on, since the AI tends to be pretty reliant on having a strong reward signal, which would be hard to provide in open levels. Trial stars are something I have strongly considered, seems like it would be fun ahah
@nobafan7515 9 months ago ⁺¹
@@aitango what might work for open area exploration? I can't imagine an algorithm rewarding for linear progress would be compatible.
@PixlRainbow 9 months ago ⁺³
@@nobafan7515 a curiosity driven approach may be possible, but this would require significantly more computer resources because of the need to maintain a database of explored states, and significantly more training time as the algorithm may sometimes need to exhaustively explore everything before it knows how to progress
@lukasimus984 9 months ago ⁺¹²
this is the most insane thing I have ever seen, no joke. Please never stop making videos.
@aitango 9 months ago ⁺⁴
Really glad you liked it, thank you so much!
@Shadow64 9 months ago ⁺¹
This is dope, it shows how complex AI programming really is and you’ve clearly put a ton of work into it here.
@aitango 8 months ago
Thanks, really great to hear
@ggkproductions1632 9 months ago ⁺¹¹
But can it beat Mario Galaxy 2's Perfect Run?
@lwfawn 7 months ago ⁺²
But can it beat Goku?
@rasmuspedersen4891 9 months ago ⁺⁴
I'm wondering if it was having a hard time on the moving platforms and other moving objects because it has no visual memory, and therefore can't determine if an object is moving or not when analyzing the frame it's currently on? Maybe having a bit of extra lower-res data with the last frame stored in it could fix that?
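The idea of feeding the previous frame alongside the current one is usually called frame stacking. A minimal sketch follows, assuming frames are opaque objects; whether the AI in the video already does something like this is not stated, and `FrameStack` is a hypothetical name.

```python
from collections import deque
from typing import Any, Tuple


class FrameStack:
    """Keep the last k frames so the network can infer motion from differences."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.frames: deque = deque(maxlen=k)  # oldest frame is dropped automatically

    def reset(self, first_frame: Any) -> Tuple[Any, ...]:
        """Start an episode by repeating the first frame k times."""
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return tuple(self.frames)

    def step(self, new_frame: Any) -> Tuple[Any, ...]:
        """Add the newest frame and return the stacked observation, oldest first."""
        self.frames.append(new_frame)
        return tuple(self.frames)
```

With `k=2` the observation is exactly "last frame plus current frame", which is the smallest amount of history that lets a moving platform look different from a static one.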
@FlameLFH 9 months ago ⁺²
I wonder if the AI would have gotten through this level quicker if it had learned to spin attack mid-air to delay landings.
@agustinmendezposadas 6 months ago ⁺¹
The quality of your videos has really gone up, I can't believe you don't have more views
@xtruejudgementx5017 4 months ago ⁺¹
The way you describe the punishment system seems harsh
@DistortedMelodies_ 2 months ago
The AI choosing to kill itself rather than going through the lava tunnel will never not be funny to me 😂
@RGBHD404 9 months ago ⁺¹
I was wondering, wouldn't giving all the items in a level a unique identifier then giving the AI a reward whenever it's loaded the first time be a way to simplify the programming of the AI in the long run?
@GachaRival 7 months ago
Now let’s see Galactus beat Champion’s Road.
@Ifx- 9 months ago ⁺¹
Surely the ai seeing colour would be useful to know where it’s going?
@aitango 9 months ago
Potentially, however it does massively increase the amount of information the AI needs to learn, meaning the speed I could run it at would be much slower. Might be something I look to try though.
@oricat101 8 months ago ⁺¹
The tracks you choose for the Training section are absolute bangers and fit the theme really well.
Would be cool to know what they're called tho
@superky4458 8 months ago
I think a fun yet very time consuming idea would be to throw the ai into the game and let it try to finish the game with no, or at least minimal, human input. But I don't want to imagine how long that would take
@LegendBegins 9 months ago ⁺¹
Very cool! Do you think it's just memorizing the level, or does it generalize to new content?
@aitango 9 months ago
It’s hard to say since it did have to learn to deal with lots of different styles of levels for this one. I think it would struggle on a brand new level, but with an hour or so of training could easily pick it up
@gumbo64 9 months ago ⁺⁶
the lil swagger at 19:20 was so good lol
@Dave-rd6sp 9 months ago
I've wanted to see someone try one of these except the only reward function is "does this screen look familiar," rewarding it for finding novel areas in the game, and then training it against multiple games without alteration to see where it can and cannot make progress.
@PixlRainbow 9 months ago ⁺²
What you describe is called curiosity driven reinforcement learning. It is used in instances where the rewards are very sparsely distributed, and it helps encourage the algorithm to explore more aggressively.
@Dave-rd6sp 9 months ago
@@PixlRainbow Know of any videos on this?
@aitango 9 months ago
I've looked into these algorithms quite a lot... sadly in reality they are quite underwhelming. Not as in they don't work, but they take a crazy amount of time to train. One of the first papers to use this (Never Give Up) tried to play some Atari games, however it was trained for 10 billion frames (that's not an exaggeration, it's the exact number).
@PixlRainbow 9 months ago
@@Dave-rd6sp There is one titled "Curiosity-Driven Learning of Joint Locomotion and Manipulation Tasks" by the Robotic Systems Lab at ETH Zürich. It appears that the key is to limit the curiosity to just specific elements or traits within the environment to limit the search space. In the case of the robot, the position of the robot is ignored and only the position of the box is tracked. Care also has to be taken not to give too much reward to exploration, or the algorithm will simply spend all its time exploring. However while this helps keep the model from getting stuck, it still takes much longer to train a model this way compared to a hand-tuned reward that closely matches the task.
@njtdfi 9 months ago
i love the models getting even nuttier, you're like the vedal of AI playing games and it's underappreciated
@bigchungus7870 9 months ago
This AI be pulling out the speedrunner strats
@nightmaredarkflame6472 4 months ago
I wonder if making a reward for the incrementation of the Star counter would make the AI desire to collect them (assuming it makes it to a Star)
Also, an idea for the health system came to me: Your system works well, but maybe add a multiplier to the progress reward based on the health? Nothing crazy high of course but it might just give the AI the incentive to both stay healthy and collect coins to heal so it gets more reward, thus making it last longer
I also noticed it was starting to learn to wait at the end! It can learn patience...err...kinda!
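The health multiplier proposed in this comment could look roughly like the sketch below. Everything here is an assumption for illustration: `shaped_reward`, the `bonus` size, and a maximum health of 3 are hypothetical choices, and the actual reward system in the video is not shown in the comments.

```python
def shaped_reward(progress_delta: float, health: int,
                  max_health: int = 3, bonus: float = 0.2) -> float:
    """Scale positive progress by up to (1 + bonus) at full health.

    Negative progress is left unscaled so that being healthy never makes
    a penalty worse.
    """
    if progress_delta <= 0:
        return progress_delta
    multiplier = 1.0 + bonus * (health / max_health)
    return progress_delta * multiplier


print(shaped_reward(1.0, health=3) > shaped_reward(1.0, health=1))
```

Keeping `bonus` small follows the "nothing crazy high" suggestion, so progress still dominates the signal.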
@angzarrpsyco 9 months ago
I would love to see an AI like this be fed a full resolution, live feed of the game. Of course this would require either a ludicrously powerful setup or a huge advancement in image reading speeds and image comprehension. It would be so cool to have the AI be allowed to run for as long as possible going through the full game, with its end goal probably being "Get to 100% as fast as possible", causing it to understand the game engine more deeply than we as humans currently do
@angzarrpsyco 9 months ago
Imagine the AI being given years of run time
@somdudewillson 9 months ago ⁺¹
You can sorta do this much more cheaply by first training a separate AI to compress the footage into a more meaning-dense form.
@plasmaballin 8 months ago
Title: "AI DESTROYS The Centre Of The Universe"
Video: The Center of the Universe destroys AI
@ryzonno4149 9 months ago
It would be really cool to see the ai attempt to learn to fight bosses
@gaggix7095 9 months ago ⁺⁶
Maybe it would be cool to initialize the weights of the agent by first training the model with behaviour cloning on, like, your own gameplay of the game for example, maybe it's blasphemous to say ahah
@aitango 9 months ago ⁺⁶
I always love getting AI to learn from scratch, however learning from demonstrations does have quite a bit of research behind it and is something I've looked into since it has quite the potential upside. One problem tends to be that they need a lot of input for that to work (probably 100s of hours), so that has discouraged me thus far
@jeynarl 9 months ago
4:30 if you do an AI on Mario 64 where it takes advantage of its de facto speed I'll say that pannenkoek better watch out cuz Galactus will probably start building up so much speed that it'll start hopping QPUs before we know it
@alligatore9253 9 months ago
11:59 I see the cheeky little subtitle there
@amberbryce9594 4 months ago
I would like to see a part three where an ai defeats bowser
@mariovelez578 8 months ago
Can you do a video on how to learn reinforcement learning like this? I want to create my own ANNs instead of using e.g. TensorFlow. I have looked at many resources on RL and it’s overwhelming, I don’t know where to begin. Maybe a small intro tutorial series on the topic would be cool.
I absolutely love your videos!
@AcousticJammTheGamer 9 months ago
It's not over until it's over. You should make an AI beat Bowser.
@raphaelfrey9061 9 months ago
One day there will be an ai% speedrun where you have to program the best ai
@aitango 9 months ago ⁺¹
I hope to be the first one making some entries in that category haha
@AIShipped 8 months ago
I would love to see another video that goes into deeper detail about the algorithm and model. For example, what model is it (a simple deep perceptron network?), what is the output layer (a softmax?), and what is the graph thing on the right of the video (predicted reward as the output of the last layer?). If so, is the action chosen just the max of it? Also, are you using PPO? And most important, how do you control the game, get the screen values, and run the game alongside your model (I'm assuming every tick of the game waits for the model to compute an action)? Or did you dive into these details in a different video? That would be greatly appreciated!
@aitango 8 months ago
I actually have a video talking a bit more about my setup and the network called “the evolution of my Mario Kart AI”. It’s about a different game, but the network is the same. At some point I might do something even more detailed though
@jaredburk4986 9 months ago
Do you think that an extremely sophisticated AI with basically all of the possible moves in a game would be able to find a new glitch/exploit if it ran for way too long?
@lonelyPorterCH 9 months ago
Seems like it's pretty similar to generative AI, the output quality is highly dependent on the human giving it the right prompts/rewards^^
@stekarkaytrio3268 8 months ago
I don't know if this is possible (or if you tried this) but rather than having dying be a set value of -reward, if it were to lose all reward I feel like it would stop Galactus from trying to die
plus it kinda makes sense, if you die you gotta restart, you lose all progress (unless you got a checkpoint but still)
but again I don't know if you did this or if it's even possible
@moomanchicken6466 9 months ago
Do you know what the jump in progression at ~80 hours training was caused by on the graph at 17:00? Also have you considered using transfer learning from another model which has already been trained on a video feed from a game? I assume by training your model from scratch that it would have to learn the low level attributes of what makes up an image before it can learn how those attributes relate to game progression, though idk much about your architecture. Interesting video as always 👌
@andyghkfilm2287 8 months ago
14:14 I mean this is basically how I play this part
@amr0733 9 months ago
Try adding memory inputs, it starts off with 0 and depending on itself and other inputs it changes.
@montymole297 9 months ago ⁺¹
Do you think you'll ever release one of these neural network setups?
@aitango 9 months ago
I'm definitely going to be releasing the AI learning algorithm since I'm hoping to publish it in a conference sometime this year
@buzzbuzz1691 9 months ago ⁺¹
Lets go this is the best channel ever
@bigscoop91 5 months ago
You have the best A.I. I have seen...All your videos are great well done!
@JerryFlowersIII 9 months ago
obviously harder than it sounds, if only there was a reward just for exploring to encourage experimenting. I'm sure I'm way naive to the mysteries of the neural network of a learning computer.
@PixlRainbow 9 months ago
It exists, it's called Curiosity-Driven Learning/ Curiosity-Driven Exploration. A key difficulty with this approach is that you have to maintain a database of all previously explored states, and this database can become quite large and sluggish for complex worlds or control schemes.
@AL_383 9 months ago ⁺¹
What are your (training) pc specs? great vid as always
@aitango 9 months ago ⁺⁴
I am running this on a desktop PC, with an RTX 4090, Intel i9-14900K and 64 GB RAM
@AmaroqStarwind 9 months ago
I'd take more of a Toriel approach, holding its proverbial hand with some human gameplay.
@njtdfi 9 months ago
should really try some Tiny series language datasets and go back to the self-supervised Q-learning (Meta's recent paper about A* is a cool take on language search space methods). talk it through a game like Pokémon.
@Bingocat4 9 months ago
Maybe I missed it in the video but why isn’t the twirl/spin jump allowed? Seems like it’d make the game a lot easier for the ai
@aitango 8 months ago
I was still having some trouble with using motion controls for the AI when I set this up. I think I’ve got it now though so might do some motion stuff in future videos
@SolusWhite 8 months ago
Love seeing the learning progress from just coding, once they can see through eyes to make judgments, they will vastly improve... also, you cut the part where it finished the level lol.
@ryleestatler7067 8 months ago
I wonder if it would be possible to get an ai to play portal
@duckdudette 9 months ago
Great video! Love the new end card
@aitango 9 months ago
Thanks!
@todorstojanov3100 9 months ago
Make it do the perfect run
@donskelz7771 9 months ago
Hey Tango, just curious but what was the size of training data after tens of hours of training?
@aitango 9 months ago
The AI experienced a total of 50 million frames of the game (these were each taken four frames apart though, meaning the AI played 200 million frames of the level)
@Crossant-0 6 months ago
make it fight king whomp
@Obstagoon862 9 months ago
The fact that we didn't get to see the AI beat the boss really triggers me.
@deatgr3623 8 months ago
I’m high right now and this is insane. There's no way you made an AI that learned to kill itself because it was less painful than trying. They're gonna put you in the I Have No Mouth, and I Must Scream chamber because you put them in the Mario Galaxy lava torture chamber
@Robertson770 8 months ago
Having 100%'d the game 3 times, I know the final level is no easy thing
@aitango 8 months ago
I honestly forget how hard some of these games were when making AI for them
@GameCorpCrew 9 months ago
I wonder if this could be trained to play the whole game and even beat speedrunners' times?
@BetaTester704 8 months ago
Make it play the whole game on stream
@Normoe445 9 months ago
Damn, that intro was fire 🔥🔥🔥
@aitango 9 months ago
Glad you liked it haha
@while_coyote 9 months ago ⁺²
You should go into the actual tools you're using.
@cowcat8124 8 months ago
I just realized that the AI can't spin
@Ferny1415 8 months ago ⁺¹
Who the fuck is AL and how did he get so good!?
@Pencil-gb3oz 9 months ago
this is just straight up cool
@aitango 8 months ago
Glad you like it!
@RayAkuma 9 months ago
Idk why people would be terrified about AI getting their job. Take my job. I'd get more time to do the things I like to do.
@MajoraLuigi1987 9 months ago
You forgot to beat bowser
@tekbox7909 9 months ago
A bit mean to punish speed running tricks
@aitango 9 months ago
I would’ve let it continue, but it was mostly dying as a by-product of the speed runner strats
@ZamoreWhite-vd3qk 9 months ago ⁺¹
Ai tango can you please teach the ai to play Mario kart 7 I would really appreciate it if you have the time ❤🙏
@aitango 8 months ago ⁺¹
It would probably take a while to set up, but if I can get in contact with some of the devs of a Nintendo switch emulator I could probably do a bunch of newer games
@anotheruser133 9 months ago
If I'm being honest, I want to see the AI play recklessly, as that can create funny moments. Also, cheating the wall by using the fire bars to jump up wasn't really an issue but fun. It should have only been when the AI dies that it loses reward (or doesn't get reward, yet I don't know AI better than you so...)
@frn6phantom794 9 months ago
This is really interesting. Makes me really want to try this as well with some other games. Is there any source code?
@David_TH54 8 months ago
What's the name of the song at the beginning?
@Peyatoe 9 months ago
1:02 I thought you said calculus 2.0 lol
@aitango 9 months ago
Love me some calculus haha
@decreer4567 9 months ago
Do you have source code for the environment?
@CrayonEater9845 9 months ago
How do you spawn the AI in a random location?
@aitango 9 months ago
I just went through the level and made savestates at a bunch of different points, then randomly choose one to spawn the AI at
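The random-spawn approach described in this reply amounts to picking one savestate uniformly at random at the start of each episode. A tiny sketch, assuming the savestates are just file paths; the file names and `pick_spawn` are made up, and actually loading a state into the emulator is left out because the comments do not say how that is done.

```python
import random
from typing import Sequence


def pick_spawn(savestates: Sequence[str], rng=random) -> str:
    """Choose one savestate uniformly at random for the next episode."""
    return rng.choice(savestates)


# Hypothetical savestates made at different points in the level.
savestates = ["checkpoint_0.sav", "checkpoint_1.sav", "checkpoint_2.sav"]
next_spawn = pick_spawn(savestates)
```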
@tanuki_raccoonYT 9 months ago
i was so lost but the video was interesting :)
@florismmsmit 9 months ago
Can you make an AI play Splatoon?
@anony-meo7774 8 months ago
Could it play call of duty zombies?
@aitango 8 months ago ⁺¹
Guess you’ll have to wait and find out :)
@PenguinBoi27 9 months ago
Amazing
@aitango 9 months ago
Thanks
@David-mv8fo 8 months ago
Why did you cut it at the end?
@marty2035 9 months ago
Nice!
@aitango 9 months ago
Thanks!
@krazualtera391 4 months ago
Train ai on tasbot
@mickobee 9 months ago
Is there somewhere I can view the code for this ?
@aitango 9 months ago
Not yet, but one day. The AI learning algorithm however is something I built myself and am hoping to publish soon
@mickobee 9 months ago
@@aitango oh I was just wanting to look at it from a student point of view lol
@AdaTheWatcher 9 months ago
Do you think AI could complete Rain World?
It's a very hard survival platformer few people even complete.
@Hydra27 9 months ago
Hydra 👍🏻
@aitango 9 months ago
😊
@winnerwannabe9868 8 months ago
1:26 "but instead of gobbling up candy, it's calculating rewards." LOL nerd.
@aitango 8 months ago ⁺¹
Never heard an AI be referred to as a nerd before hahaha
@原真はらまこと 9 months ago
I'm going to unsubscribe to tidy up my subscriptions
@aitango 9 months ago ⁺²
:(
@原真はらまこと 9 months ago
@@aitango sorry~
@DanyaV-real 9 months ago ⁺²
Why say that though?