Training Stickmen to Walk With and Without Curiosity | ML Agents

  • Published Sep 21, 2024
  • Github repo: github.com/ThomasLynn/Stickman-Walker
    Two stickmen trained using Unity's ML-Agents: green was trained with no intrinsic reward, pink with an intrinsic (curiosity) reward of strength 0.03.
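    For reference, curiosity is switched on per behavior in the ML-Agents trainer config, under reward_signals. A minimal sketch of the difference between the two runs, assuming a PPO trainer and a hypothetical behavior name (neither is taken from the repo):

        behaviors:
          Stickman:                # hypothetical behavior name
            trainer_type: ppo
            reward_signals:
              extrinsic:
                gamma: 0.99
                strength: 1.0      # the distance-based reward, used by both agents
              curiosity:           # omit this whole block for the green (no-curiosity) run
                gamma: 0.99
                strength: 0.03     # intrinsic reward strength used for the pink agent
                learning_rate: 3.0e-4

    Training would then be launched with something like: mlagents-learn config.yaml --run-id=pink_curiosity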

Comments • 11

  • @revimfadli4666 • 9 months ago

    The Chad Curiosity vs Virgin No Curiosity

  • @123teixeira123 • 10 months ago

    nice job man!

  • @mikhailhumphries • 1 year ago

    When will you come back and make more machine learning videos?

  • @traviswoolston8108 • 9 months ago

    is it balancing as well?

  • @theantssousa • 3 years ago

    Very cool. Is this project on GitHub? I'm trying to do something like that but I can't get it working.

    • @zoelynn126 • 3 years ago

      Thanks! I've made the github repo public, so hopefully you will be able to find it all here: github.com/ThomasLynn/Stickman-Walker
      Good luck on your project!

    • @theantssousa • 3 years ago

      @zoelynn126 Thanks!

  • @mcdenyer • 3 years ago

    So you used curiosity strength at 0.03? How did you settle on this number?

    • @zoelynn126 • 3 years ago • +2

      That's a good question. I was planning on turning this into a full video explaining it all, but I don't yet have the script-writing skills to make it good.
      The way to find a good strength is: first pick a sensible starting value (normally 0.01), then do a little trial and error.
      I trained with 0.01 curiosity (the middle of the typical 0.001-0.1 range from the docs) and got ~4 extrinsic reward (~40 metres / 10) and ~1 intrinsic reward, so ~20% of the total reward came from curiosity.
      Normally 20% is pretty high, but I'm using curiosity to change how the stickman walks, rather than to passively explore an area, so it needs the high percentage for this to work.
      I tried a few values (in the order I tested them: 0.01, 0.1, 0.05, 0.02, 0.03, kind of like a binary search).
      At strength 0.01, the curiosity was too small to affect training. Strength 0.02 sometimes affected training and sometimes didn't. Strengths 0.03 and 0.05 always led to walking instead of galloping. At strength 0.1 it did learn to walk, but so slowly that it ran out of training time before reaching the second obstacle.
      Basically, any strength of 0.03 or above always gave walking, but the higher I went above 0.03, the slower it trained (about 80% training speed at strength 0.05, and ~40% at 0.1).
      So the reason I used 0.03 in the video is that it worked, and it trained the fastest of the three values that did, giving it the most time to perfect its walking.
      docs link: github.com/Unity-Technologies/ml-agents/blob/release_14_docs/docs/Training-Configuration-File.md#curiosity-intrinsic-reward
      Also, I read the intrinsic reward numbers off TensorBoard. It's pretty useful, I'd recommend it.
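      Those per-signal reward curves can be read by pointing TensorBoard at the ML-Agents results folder; a sketch, assuming the default output directory:

          tensorboard --logdir results --port 6006

      ML-Agents logs the reward signals separately (e.g. Policy/Extrinsic Reward and Policy/Curiosity Reward), which is how the ~4 extrinsic vs ~1 intrinsic split above can be read off.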

  • @frenziedcomet • 3 years ago

    I’m disappointed by the lack of talking

    • @zoelynn126 • 3 years ago

      Well, me too =/ I still can't figure out how to script this one.
      Thought I'd upload something though, as it was getting to the point where I was just staring at a crappy script for days on end.