🌐 Have you found the videos Helpful and Valuable?
❤️ Get my Courses unitycodemonkey.com/courses or Support on Patreon www.patreon.com/unitycodemonkey
Hello pal, nice video. Can you help me out with a small project?
This is the content I pay internet for
Code Bullet's was better
@@ElMatero6 Don't really care tbh. I just watch the content from the creators I enjoy
I just saw an ad for your courses on another random video and it blew my mind. I thought I was watching one of your regular videos when I heard your voice. It wasn't until I saw the skip button that I realized I was watching an ad. Congratulations, I watched the entire ad.
💬 Can I use Unity ML-Agents to teach an AI to play Flappy Bird and beat my high score?
Once again it's surprisingly easy to use Machine Learning in Unity! Only took me a couple of hours to set this up and apply it to a previously made game.
📦 Complete Unity Machine Learning Playlist: th-cam.com/play/PLzDRvYVwl53vehwiN_odYJkPBzcqFw110.html
pin this
Really interesting video! Great to see it in action after your setup and training.
Great video Code Monkey ! It shows an application on the raycasts answer you gave me in the previous vid !
"It's actually really quite simple, it only took me a few hours..." I'm laughing so hard right now 🤣🤣
7:10 AI be like: "Sometimes my genius is... It's almost frightening"
this is an awesome concept to show what the mlagents toolbox is capable of
Fantastic as always. Have you thought of any non-game related tasks that ML Agents would be useful for? I know there are ML platforms more appropriate for non-game scenarios, but I'm more comfortable in Unity, and since a trained model can run in a built executable, it would be neat to see Unity used for non-game applications that make use of a pretrained model. I've also thought about setting up an environment where the agent decides which state (as in state machine) to run, rather than directly control movement. That might result in a smoother-looking NPC that is still running a trained model to determine state.
One thing Unity is pushing for is robotics: they've made a whole bunch of tools so you can train your robot inside Unity and then use the trained model outside.
Yup, using ML to drive a state machine and combining it with normal movement is on my list to research; it should be doable and provide interesting results.
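A minimal sketch of that idea, assuming a recent ML-Agents version with the ActionBuffers API (the NpcState enum and stateMachine field are hypothetical, just for illustration):

using Unity.MLAgents;
using Unity.MLAgents.Actuators;

public enum NpcState { Idle, Patrol, Chase, Flee }

public class StateDrivenNpc : Agent
{
    public NpcStateMachine stateMachine; // hypothetical state machine component

    public override void OnActionReceived(ActionBuffers actions)
    {
        // One discrete branch with 4 values; the agent picks a high-level state,
        // and the normal state machine code handles the actual movement
        stateMachine.SetState((NpcState)actions.DiscreteActions[0]);
    }
}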
Really cool!
Bro, I have played all your games and they are nice 😍😍😍.
I have always had a question about ML-Agents: agents randomly select actions at the beginning of training. Can we incorporate human intervention into the training process to make them train faster? Is there a corresponding method in ML-Agents? Looking forward to your answer.
I would like to see how to set up an AI to handle randomness. I'm trying to train an AI for poker, and my 'simple' version where they literally just call or fold the flop isn't producing the expected results. It may have something to do with the fact that the AI agents are all competing against one another with the same brain, so what might be good in one hand might be bad in other hands. (Currently the AI is only looking at their hand: what the card ranks are, whether they are suited and whether they are paired.)
Currently doing PPO training because I haven't figured out the YAML setup for SAC training (which is probably better for this task).
What's funny is that for the first few minutes it shows expected results - calling AA and KK more often than other hands - but over time it stops playing AA and KK completely.
Hmm that might be an issue with your observations.
When I first attempted to do an AI driver it would eventually fall into a local minimum where it would just instantly drive towards a wall.
The issue was that the observations weren't normalized, which was confusing the algorithm. The solution in that case was to set the hyperparameter normalize: true, which normalizes the observations and helped the car try more actions until it found ones that worked.
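For reference, in recent ML-Agents versions that setting lives in the trainer config YAML under network_settings (a sketch; the behavior name is illustrative):

behaviors:
  CarDriver:
    trainer_type: ppo
    network_settings:
      normalize: true    # normalize vector observations over time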
When the RayPerceptionSensor2D component is added to the character, does it automatically collect observations, or is it still necessary to manually add the observations in the public override void CollectObservations(VectorSensor sensor) method in the agent's script?
Nice video once again! Would you consider making an AI that learns to play chess like AlphaZero? Or maybe some other more complex game (existing or invented by you) just to show off some cool things the AI can come up with.
I just want to see it do something more meaningful, like in the hide-and-seek example by OpenAI; hopefully we can achieve similar results without a huge GPU farm...
Speaking of GPU farms, would it be possible to use Google Colab to train these models?
Yup I am planning to do some more complex examples as I learn more and more.
Unity is currently in the alpha stage of their ML-Agents Cloud tool, which will let you train your agents in the cloud.
@@CodeMonkeyUnity oh wow, that should be awesome when it's finally out
Hi Code Monkey. Absolutely loving this new series. Could you give a bit of advice on how to handle tank movement for an AI please? (Acceleration + Reverse) and (LeftTurn + HoldTurn + RightTurn) - these two I figured out, but I am having a hard time figuring out how to put all of these into an action sequence.
Again I absolutely love your content. Thanks so much for making them.
You define 2 discrete actions, each with 3 possible values.
Then in the code (e.g. in OnActionReceived) you match those values to those actions:
if (action[0] == 0) { } // Don't move
if (action[0] == 1) { } // Accelerate
if (action[0] == 2) { } // Brake
if (action[1] == 0) { } // Don't turn
if (action[1] == 1) { } // Turn left
if (action[1] == 2) { } // Turn right
I've got a video on a Car Driver AI coming next week which controls exactly like that.
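A quick way to sanity-check that mapping before training is a Heuristic method you can drive yourself; a sketch assuming a recent ML-Agents version with the ActionBuffers API (the key bindings are just an example):

public override void Heuristic(in ActionBuffers actionsOut)
{
    var discrete = actionsOut.DiscreteActions;
    // Branch 0: 0 = don't move, 1 = accelerate, 2 = brake
    discrete[0] = Input.GetKey(KeyCode.W) ? 1 : Input.GetKey(KeyCode.S) ? 2 : 0;
    // Branch 1: 0 = don't turn, 1 = turn left, 2 = turn right
    discrete[1] = Input.GetKey(KeyCode.A) ? 1 : Input.GetKey(KeyCode.D) ? 2 : 0;
}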
@@CodeMonkeyUnity Yaaayyyyyyyy I'll see what I can do and compare with your video to see if I got it right! 😁😁😁
A question on the parameters. You keep mentioning how you're changing the scenario for the model. Do you simply `--resume` training when you do the changes? I noticed that some of the parameters change based on step_number/total_steps. Do you just ignore that part or is there anything else to play around with those?
Is there an in-depth tutorial/course or this project available to download on patreon?
I love your videos
I started game development and Unity by watching your videos.
I am currently working on a snake AI using ML-Agents, but I am having some issues resuming the training after changing the inputs on my agent.
How did you teach your flappy bird agent through the various phases?
Could you please make a video on that?
Truly love your content
It's all about how you design the training world. So in your case I would start training it without any snake growth, just teach it to move towards the food
Then add some fixed obstacles and make sure it learns to avoid them.
Then make another training world with the final rules, and make sure it knows the tail of the snake is an obstacle that moves and should be avoided.
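As a sketch of the reward shaping for those phases (the tags and values are illustrative, not tuned):

private void OnTriggerEnter2D(Collider2D other)
{
    if (other.CompareTag("Food"))
    {
        AddReward(1f);   // phase 1: reward reaching the food
        EndEpisode();
    }
    else if (other.CompareTag("Wall") || other.CompareTag("Tail"))
    {
        AddReward(-1f);  // later phases: punish hitting obstacles, including the tail
        EndEpisode();
    }
}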
Dear CodeMonkey, thanks a lot for all of your great tutorials. Would you please help me understand how you got these 7 iterations of ever-improving models? Am I correct to assume that you trained for some amount of steps on a certain config (e.g. extrinsic: 0.0, bc: 0.5, gail: 0.5) until you liked the reward curve, then stopped, changed the config (e.g. extrinsic: 0.5, bc: 0.3, gail: 0.3), and trained with a new run-id but used --initialize-from to build upon the last trained model? If that is not how you did it, how did you improve on the existing training for every iteration with a changed config file to get these 7 comparable iterations? Thank you very much!
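For what it's worth, the ML-Agents CLI supports exactly that workflow; each iteration might look roughly like this (the config names and run IDs are illustrative):

mlagents-learn config_v1.yaml --run-id=Bird_v1
# stop when the reward curve looks good, edit the reward signal strengths, then:
mlagents-learn config_v2.yaml --run-id=Bird_v2 --initialize-from=Bird_v1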
@codemonkey great video and really helpful
I'm pretty bad at Flappy Bird. Nice video
Server Build option is not there anymore in Build Settings for Unity 2022, any idea where to find it?
Make sure you install the Dedicated Server Module on your Unity version
What about training it to play games like Tic Tac Toe or Chess? Would it be tough to teach it to play chess?
Chess is probably tricky since it's a relatively complex game, but it can certainly be done. It just might take a lot more processing power than my standard machine.
@@CodeMonkeyUnity ohh kk👍🏼
@@CodeMonkeyUnity True, I heard somewhere that apparently there are (theoretically) more unique moves in Chess than there are stars in the sky. For this reason a "perfect" Chess AI would be physically impossible to create since it would basically require infinite complexity and processing power to calculate for all possible moves. That being said, I'm pretty sure if you trained a Chess AI by using the game data from chess grandmasters you would still end up with a rock solid AI.
This opens a bunch of tiny windows, and only the selected window simulates. Not to mention the crashing.
Hi CodeMonkey, did you ever try using pure behavior cloning to train the model? I did try setting the RL reward signal strength to 0 and behavior cloning strength to 1, but it seems it is quite hard to make it work.
Hmm, I've never tried just with imitation learning. If you do, it will likely be very limited unless you manually play thousands of playthroughs; it's really meant just to get the agent started, and then you teach it with normal reinforcement learning.
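For context, those strengths live in the trainer config roughly like this (a sketch assuming recent ML-Agents YAML syntax; the demo path is illustrative):

behaviors:
  FlappyBird:
    trainer_type: ppo
    reward_signals:
      extrinsic:
        strength: 1.0        # set to 0.0 for pure imitation
        gamma: 0.99
      gail:
        strength: 0.5
        demo_path: Demos/Bird.demo
    behavioral_cloning:
      strength: 0.5
      demo_path: Demos/Bird.demo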
From your code (i.e., CollectObservations in BirdAgent.cs), I did not see how you handle the Ray as an observation. Is it handled automatically by Unity? If that is the case, did you account for its dimension in the observation size in the Behavior Parameters?
The Ray sensor handles that automatically; you don't need to set the observation size manually.
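In other words, CollectObservations only needs the hand-picked observations; the sensor component appends its ray hits on its own, and the vector Space Size in Behavior Parameters only counts the manual ones. A sketch (the observation itself is just an example):

using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class BirdAgent : Agent
{
    // RayPerceptionSensor2D sits on the same GameObject and adds its
    // observations automatically; nothing ray-related is needed here.
    public override void CollectObservations(VectorSensor sensor)
    {
        sensor.AddObservation(transform.position.y); // e.g. the bird's height
    }
}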
@@CodeMonkeyUnity Thanks for the reply.
Hi CodeMonkey. What a nice video!!
However, I have some questions. When I change the Unity environment to implement curriculum learning, is the only thing I have to do training the agent in the various environments?
I mean, it doesn't need something like '--initialize-from' or 'environment parameters, curriculum, completion criteria'?
You can do it manually by increasing the difficulty of the scenario and training with --initialize-from
Or follow the instructions in the docs for how to create a curriculum environment
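The docs approach ends up looking roughly like this in the config (a sketch; the parameter name, behavior name and thresholds are illustrative):

environment_parameters:
  pipe_gap:
    curriculum:
      - name: WideGap
        completion_criteria:
          measure: reward
          behavior: FlappyBird
          threshold: 0.8
        value: 5.0
      - name: NarrowGap      # final lesson, no completion criteria
        value: 3.0

The game then reads the value each episode, e.g. via Academy.Instance.EnvironmentParameters.GetWithDefault("pipe_gap", 5f).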
Thank you for your help!!
Hi, in the training, it looks like the game can only run for 1 episode and then stops at the game over menu. Does this mean we have to delete the game over menu during ML-agents training?
You don't have to delete it, just change the code to reload the current scene instead of going to a game over scene. When you're done training just change it back
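A sketch of that switch (the trainingMode flag and gameOverWindow call are hypothetical):

using UnityEngine.SceneManagement;

private void OnGameOver()
{
    if (trainingMode)
    {
        // During training: skip the menu and restart the scene immediately
        SceneManager.LoadScene(SceneManager.GetActiveScene().buildIndex);
    }
    else
    {
        gameOverWindow.Show(); // normal gameplay: show the game over menu
    }
}

(If the reset is cheap to do in code, calling EndEpisode() and repositioning things in OnEpisodeBegin avoids reloading the whole scene.)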
How many RTX 3090s do you need to train your own AlphaZero in a week, two weeks, or a month?
Can ML-Agents be used for other types of games, like 2D platformer bosses/enemies or fighting game AI?
It can be used for anything; as far as the machine is concerned it's all just data, and it's up to you to define what that data represents.
I was wondering for some time: if you use ML-Agents and run a Unity simulation on a server, does the AI learn normally, or does it only gather data and not improve on the fly (since no Python is running)? Or do you need to do something special?
Yes, in order to train you need to use Python, so just having the bot play by itself in a simulation won't train it.
However, Unity is currently working on an ML-Agents Cloud to do exactly that.
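Concretely, the Python trainer is what drives the learning; pointing it at a built executable looks roughly like this (the path and run ID are illustrative):

mlagents-learn config.yaml --run-id=BirdServer --env=Builds/FlappyBird --no-graphics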
@@CodeMonkeyUnity Thanks!
Hi, I have recently been trying to record gameplay data myself. However, I found that the recorded data is extremely imbalanced. For instance, in Flappy Bird, frames with a jump click make up less than 1% of the frames with no jump. Could you please suggest any ideas for handling this? Or are there any tricks in ML-Agents for handling this when recording the demonstration data? Many thanks ahead. ---Teddy
Not sure what you mean; the number of frames doesn't matter, all the algorithm looks at is the state of the sensors when you do jump. I covered Imitation Learning here: th-cam.com/video/supqT7kqpEI/w-d-xo.html
Seems the AI always learns the quickest way to die first
I enjoyed your amazing video.
I'm going to watch the "AI Learning to play Flappy Bird" video and follow it.
I downloaded the package file from the site and imported it.
However, the configuration file (.yaml) is invisible.
Where can I find this?
Invisible how? In the explorer? support.microsoft.com/en-us/windows/view-hidden-files-and-folders-in-windows-97fbc472-c603-9d90-91d0-1166d1d9f4b5
You can modify the one included in the official samples
@@CodeMonkeyUnity thank you
But I cannot find the yaml file in the package.
BTW, where did you define the elements for the detectable tags on the Ray sensor? It would be nice if you could share more details on setting it up.
You just assign the Tag to the Game Objects you want. Go to the Tags and Layers window
Is your bird going on and on forever? I also trained an AI to play flappy bird and it does pretty well but fails at 300-400 score.
Uhh, I have a problem. When I run any ML-Agents project, I see the same error:
Missing Profiler.EndSample (BeginSample and EndSample count must match): ApplyTensors
Previous 5 samples:
GC.Alloc
ApplyTensors
GC.Alloc
Barracuda.PeekOutput
FetchBarracudaOutputs
In the scope:
ApplyTensors
FlappyBird
ModelRunner.DecideAction
root.DecideAction
Unity.ML-Agents.dll!Unity.MLAgents::AcademyFixedUpdateStepper.FixedUpdate()
help pls :)
Best thing I learned from this is how to make server builds...
How do I install mlagents_envs?
The same way you install normal mlagents th-cam.com/video/zPFU30tbyKs/w-d-xo.html
@@CodeMonkeyUnity So it's "pip install mlagents_envs"?
And is it the same for gym_unity?
Why don't you make tutorials for something like this? :(
I've made a bunch of detailed ML tutorials unitycodemonkey.com/search.php?q=machine%20learning
Is it just me or does code monkey sound like Kermit The Frog?
AI learns to play Flappy Bird 🤔 impressive.
But can an AI learn to play Civilization 5?
Given enough data and enough training time it can learn anything. The most advanced AI can beat the best human DOTA players.
👌
I really need this on my PC to play the same game, just with different colors, for me. I will pay. Anyone? Please
What do you need?
@@Unknownuser12347 I need a bot that plays Flappy Bird for me on the bitcoinmaniagame platform, to earn me electricity by jumping or laying low to pass through gates.
@@silvestervarga8503 And you want to change this and recreate it for your bitcoin game?
@@Unknownuser12347 I have no clue whatsoever about coding, libraries and stuff. I need the AI to play that game for me, with the software and everything set up so I can run it on my PC :/. If you can help, want to talk more on Discord or Telegram?
👍
Ok, I'm late, sorry
It would be useful for leveling up bots in some games )))
Can you help me make a game?
Excellent video and very interesting, but a proper tutorial would have been way more useful; I'm still none the wiser about how to achieve this myself.
The machine learning video is a complete step by step tutorial th-cam.com/video/zPFU30tbyKs/w-d-xo.html
If you follow that one you should then be able to follow this overview
@@CodeMonkeyUnity Thank you, that's perfect! Amazing videos by the way!! One question: how do you train it, then train it some more, like you did with the pipes getting more complex?
Ah, --initialize-from, yeah? Cheers
My bird always just goes upwards. I really don't understand.
This is not AI, but it's a good bot.
What?
Do not support the Riot Games company.
It would take me way too long to write out in detail all the reasons why you shouldn't.
So I will only post the most important ones:
The ranked system became a joke that only inflated the egos of thousands by giving them elo that they do not deserve (ladder changes).
High-elo game quality has been in terrible condition since the changes, because now you can win simply by getting all the ex-diamond players (Dopa explained it better in his video, explaining how it affects actual challenger players).
Not gonna mention the skin abuse and the unfun champs released, because that is just a huge topic that needs a lot of information.
In terms of programming, tell me one game, ONLY ONE, that has an entire channel (Vandiril) focused only on GAME-BREAKING BUGS the way League does. You may say "BUT ALL GAMES HAVE BUGS, NO ONE IS PERFECT", and you are completely right; having bugs is completely okay. The thing is, Riot has the RESOURCES to fix them and chooses not to; even worse, they just place a bug on top of another.
That's all I'm going to say for now, but just for the note, I'm actually planning on releasing an entire document (currently sitting at 119 pages) about Riot Games' incompetence, obviously covering everything in detail and with actual sources.
@@Micaniker Why would you ask if the next thing you would do is answer like that?
Mate, people don't enjoy playing League; most of the League stars play because it still gives them money, or they are waiting for the next big game to come up.