Training an AI for WIPEOUT (MLAgents Unity Reinforcement Learning)

  • Published Sep 6, 2024

Comments • 28

  • @AirpowerNow • 1 year ago +2

    Is your code open source?

    • @alexandresajus • 1 year ago

      Yes! I pushed the code on GitHub here: github.com/AlexandreSajus/Total-Wipeout-AI

    • @DeadRabbitCanDance • 9 months ago +1

      @alexandresajus Big thanks for the excellent tutorial!

  • @fotiskapotos • 1 year ago +4

    Great work! I would love to see a longer-form video with a more in-depth exploration of how you build your environment and create your agent. Subscribed!

  • @RobTheDon1234 • 1 year ago +2

    I could see you going BIG if you do more vids like this, keep it up Brodie 👍🏾

  • @daryladhityahenry • 1 month ago +1

    Wow, nice!! I notice the walk is much better than almost any other I could find. Are you using imitation learning for that? Or pure reward and punishment? If it's pure reinforcement learning, I wonder how you achieved such a good walking motion.
    Great vid!!!

    • @alexandresajus • 1 month ago +1

      Thanks! It's just reward and punishment here. The walk comes by default when you use the Walker agent from ML-Agents in Unity, so it is a good starting point.
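
ML-Agents ships a PPO trainer configuration for its Walker example (config/ppo/Walker.yaml in the ml-agents repository). A minimal sketch in that spirit is below; the hyperparameter values are illustrative, the run id is hypothetical, and these are not necessarily the exact settings used for this project. Swapping trainer_type from "ppo" to "sac" is how you would change the algorithm.

```python
# Illustrative sketch: write a PPO trainer config for the "Walker" behavior
# and launch training with the mlagents-learn CLI. Values are in the spirit
# of the Walker config shipped with ML-Agents, not this project's settings.
import yaml

config = {
    "behaviors": {
        "Walker": {  # must match the Behavior Name set on the agent in Unity
            "trainer_type": "ppo",
            "hyperparameters": {
                "batch_size": 2048,
                "buffer_size": 20480,
                "learning_rate": 3.0e-4,
                "beta": 5.0e-3,
                "epsilon": 0.2,
                "lambd": 0.95,
                "num_epoch": 3,
            },
            "network_settings": {
                "normalize": True,
                "hidden_units": 512,
                "num_layers": 3,
            },
            "reward_signals": {
                "extrinsic": {"gamma": 0.995, "strength": 1.0},
            },
            "max_steps": 30_000_000,
            "time_horizon": 1000,
            "summary_freq": 30_000,
        }
    }
}

with open("walker_ppo.yaml", "w") as f:
    yaml.safe_dump(config, f)

# Then train (run id is arbitrary):
#   mlagents-learn walker_ppo.yaml --run-id=wipeout_walker
```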

    • @daryladhityahenry • 1 month ago +1

      @@alexandresajus I see.. Thanks for the info :). Love seeing things like this :D

  • @lukqaz • 5 months ago +1

    Please make more videos like this

    • @alexandresajus • 5 months ago

      I would love to! This has been my dream project for 3 years. The problem is that these projects take a lot of work and frustration to make. Also, this was, weirdly, my worst-performing video.

  • @sergche3718 • 8 months ago +1

    Wow, nice. 40 muscles and it still learns.
    It's also funny how it developed a main pushing leg and a supporting one. Yes, it's limping, but still.
    What would be the strategy to make it aware of the moving obstacles? Right now it just tries to run fast, I suppose?

    • @alexandresajus • 8 months ago

      Yeah, this was an enjoyable project to look at. The moving obstacles were challenging to work with: on the swing, the AI does understand how to jump from the edges of the platforms to improve its chances; on the sweeper, it just learned to run as fast as possible, which could be better. I need a way to teach it timing: it should stop before the obstacle and wait for the right moment before moving forward. One way to do this is by adding observations: how close the agent is to the start of the obstacle, and where the obstacle is in its movement cycle. That information would let the agent learn that some timings are better than others. That's one solution; there have got to be better ones...
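
A small sketch of the extra observations described in that reply: the distance to the obstacle and the obstacle's position in its movement cycle, with the phase encoded as sine/cosine so it wraps smoothly. All names and values here are hypothetical; in an actual ML-Agents setup these numbers would be added to the agent's observations on the C# side, so this only illustrates the math.

```python
# Hypothetical timing observations for a cyclic obstacle (e.g. the sweeper).
import math

def timing_observations(agent_x: float, obstacle_start_x: float,
                        time_s: float, cycle_period_s: float) -> list[float]:
    """Return [distance_to_obstacle, sin(phase), cos(phase)]."""
    distance = obstacle_start_x - agent_x                 # how far the agent is from the obstacle
    phase = (time_s % cycle_period_s) / cycle_period_s    # 0..1 position within the cycle
    angle = 2.0 * math.pi * phase
    return [distance, math.sin(angle), math.cos(angle)]

# Example: a sweeper with a 4-second cycle, agent 3 m before it, 1.5 s into the cycle
print(timing_observations(agent_x=2.0, obstacle_start_x=5.0,
                          time_s=1.5, cycle_period_s=4.0))
```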

  • @bhallos_baus • 5 months ago +1

    hey bro awesome work man..

  • @abisheksunil • 1 year ago +1

    Awesome! Love the humor that you put in there, and the agent converges pretty well. Is it PPO or SAC?
    Would definitely be up for more (some ideas: CNN-based RL or, even better, multi-agent RL with similar complexity).

    • @alexandresajus • 1 year ago +1

      Thanks a lot! I trained on PPO. Yes, I definitely want to do more RL projects using Unity. I'll find something...

  • @SlimsyBeetle • 1 year ago +1

    Very cool

  • @Nebulaoblivion • 1 year ago +1

    Pretty cool!

  • @Diego0wnz • 1 year ago +1

    Very cool video! I like RL; what/where did you study?

    • @alexandresajus • 1 year ago

      Thanks! I did a Master of Engineering at CentraleSupélec in Paris, but the AI classes there were too theoretical. I learned RL primarily through the AI club we had at that Uni.

  • @keyhaven8151 • 2 months ago +1

    I have always had a question about ML-Agents: agents select actions randomly at the beginning of training. Can we incorporate human intervention into the training process to make them train faster? Does ML-Agents provide a method for this? Looking forward to your answer.

    • @alexandresajus • 2 months ago

      Excellent question. What is commonly done to choose human actions instead of random ones at the beginning of training is called "Imitation learning." MLAgents does provide documentation on imitation learning, but I have never explored it, and it is probably complex to implement:
      github.com/gzrjzcx/ML-agents/blob/master/docs/Training-Imitation-Learning.md

    • @keyhaven8151 • 2 months ago +1

      @@alexandresajus Thank you very much for your answer. I looked at the link you sent and found that it covers an old version of ML-Agents, which differs from the current one in several settings. For example, the new version no longer has the Brain or the Academy's Broadcast Hub. So, what should we do in the new version? Thank you for your answer!

    • @alexandresajus • 2 months ago +1

      @@keyhaven8151 To be honest, I don't know since I never tried imitation learning. Try to look up « imitation learning mlagents » online; I'm sure there are tutorials. Or use the older version of MLAgents.

    • @keyhaven8151 • 2 months ago +1

      @@alexandresajus Thank you very much for your answer. I will try to find a solution. Thank you!
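
For reference on the new-version question left open above: recent ML-Agents releases configure imitation learning through the GAIL reward signal and/or behavioral cloning in the trainer config, both pointing at a .demo file recorded in Unity with the Demonstration Recorder component. A minimal sketch, shown as the equivalent Python dict, with an illustrative strength and a hypothetical demo path (untested here):

```python
# Sketch of imitation-learning settings in a recent ML-Agents trainer config.
# The demo path is hypothetical; the .demo file is recorded in Unity with the
# Demonstration Recorder component attached to the agent.
imitation_settings = {
    "behaviors": {
        "Walker": {
            "trainer_type": "ppo",
            "reward_signals": {
                "extrinsic": {"gamma": 0.995, "strength": 1.0},
                # GAIL rewards the agent for behaving like the recorded demos
                "gail": {"strength": 0.5, "demo_path": "Demos/WalkerHuman.demo"},
            },
            # Behavioral cloning nudges the policy toward the demo actions early in training
            "behavioral_cloning": {
                "demo_path": "Demos/WalkerHuman.demo",
                "strength": 0.5,
            },
        }
    }
}
```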

  • @kamillatocha • 8 months ago +1

    French people

  • @Blooper1980 • 7 days ago +1

    Ok Bye!