Not Only Rewards But Also Constraints: Applications on Legged Robot Locomotion

  • Published on Nov 14, 2024

Comments • 12

  • @pulakgautam3536 • several months ago

    Hi, awesome work! I believe the paper was accepted. Would it be possible to open source the code?
    Thank you

  • @TextZip • 1 year ago +2

    Hi, the video and the paper are really impressive. I wanted to know what simulation platform was used and whether the code will be made public?

    • @railabkaist9016 • 1 year ago +1

      You can download the simulator at: raisim.com/
      The code will be made public after the paper is accepted.

    • @zhihaibi884 • 3 months ago

      @@railabkaist9016 Will the code be made public?

    • @Cha0sGO-u3u • 3 days ago

      @@railabkaist9016 I believe this impressive paper has been accepted by TRO. Looking forward to the source code of this awesome work being made public!

  • @snuffybox • 1 year ago +2

    You should link the paper

    • @NowayJose14 • 1 year ago

      It's in the description, chief

    • @snuffybox • 1 year ago

      @@NowayJose14 It wasn't when I commented.

  • @SNUDonghyeonKim • 1 year ago

    Hello, I hope you're well. While my question isn't directly related to the paper, may I ask it here? I'm wondering how significant the reality gap is. Can you consistently expect every learned policy to transfer seamlessly to the actual robot? Additionally, would the behavior of the simulated and real-world robots be nearly identical? Lastly, if one were to forgo the teacher-student structure, might there be a noticeable decrease in performance?

    • @user-pg6ym1th9 • 1 year ago +1

      Thank you for your interest in our work. What we wanted to claim in this work is that both rewards and constraints should be used when designing learning-based controllers for complex robotic systems. We used the teacher-student learning framework, but the approach is not limited to it; other methods such as vanilla learning [1] or asymmetric learning [2] can also be used. In our experience, the sim-to-real gap depends heavily on the characteristics of the robotic system you are working on (e.g., actuator mechanism, system software latency, actuation bandwidth) rather than on the learning algorithm itself. Based on the characteristics of your system, you should select appropriate methods to close the sim-to-real gap (e.g., domain randomization, domain adaptation, actuator networks [1]). In our case (Raibo, Mini-cheetah), domain randomization was enough.
      [1] Hwangbo, Jemin, et al. "Learning agile and dynamic motor skills for legged robots." Science Robotics 4.26 (2019): eaau5872.
      [2] Nahrendra, I. Made Aswin, Byeongho Yu, and Hyun Myung. "Dreamwaq: Learning robust quadrupedal locomotion with implicit terrain imagination via deep reinforcement learning." 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023.
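
      As a rough illustration of the domain-randomization idea mentioned above, here is a minimal per-episode randomization loop. It assumes a generic gym-style env with a hypothetical set_dynamics setter and a generic policy object; the parameter names and ranges are illustrative assumptions, not the authors' actual Raibo/Mini-cheetah setup.

          import numpy as np

          def sample_dynamics(rng):
              # Randomize physical quantities the real robot is likely to differ in
              # (illustrative parameters and ranges, not the paper's actual values).
              return {
                  "ground_friction": rng.uniform(0.4, 1.2),
                  "link_mass_scale": rng.uniform(0.9, 1.1),        # +/-10% mass error
                  "motor_strength_scale": rng.uniform(0.9, 1.1),
                  "action_latency_steps": int(rng.integers(0, 3)), # latency in sim steps
              }

          def train(env, policy, num_episodes=10_000, seed=0):
              rng = np.random.default_rng(seed)
              for _ in range(num_episodes):
                  env.set_dynamics(sample_dynamics(rng))  # re-randomize every episode (hypothetical setter)
                  obs = env.reset()
                  done = False
                  while not done:
                      action = policy.act(obs)
                      obs, reward, done, info = env.step(action)
                      policy.record(obs, action, reward, done)
                  policy.update()  # e.g., a PPO-style update on the collected rollouts

      The point is that the policy never trains against a single fixed simulator, so it has to become robust to whatever parameter ranges you believe cover the real robot.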

    • @SNUDonghyeonKim • 1 year ago +1

      @@user-pg6ym1th9 Thanks for the detailed discussion! I'm working on humanoid RL and was wondering if there could be a way to further reduce the reality gap. Thanks for your kind explanation. :)

  • @marshallmcluhan33 • 1 year ago

    Oh thank god they don't just care about collecting more meaningless tokens.