Advanced 4. Monte Carlo Tree Search

แชร์
ฝัง
  • เผยแพร่เมื่อ 24 ก.ค. 2024
  • MIT 16.412J Cognitive Robotics, Spring 2016
    View the complete course: ocw.mit.edu/16-412JS16
    Instructor: MIT students
    This is the fifth advanced lecture in the MIT 16.412 Cognitive Robotics of Spring 2016, led by MIT students. Students took a deep dive into Monte Carlo Tree search, and its further application in Super Mario Brothers and Alpha Go.
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu

ความคิดเห็น • 17

  • @mare4602
    @mare4602 5 ปีที่แล้ว +12

    wish it had better audio quality

  • @dsazz801
    @dsazz801 3 ปีที่แล้ว +1

    A nice presentation! Thank you guys :D

    • @garyaxl5056
      @garyaxl5056 3 ปีที่แล้ว

      @Ramon Major yup, have been using InstaFlixxer for years myself :)

  • @mr.meesicks1801
    @mr.meesicks1801 5 ปีที่แล้ว +2

    In 28:49 I am not sure if that's really true in general. In Gelly and Silver 2007 they show that using better simulation policies in UCT does not necessarily translate to better final MCTS performance.

    • @stephenmontague6930
      @stephenmontague6930 2 ปีที่แล้ว

      Yeah, the caveat he gave ("As long as it's not that much more expensive") is crucially important, because a heuristic based selection or rollout policy can be very detrimental - as shown in a number of papers (including recent work) - since heuristics take time that could have been used to process more rollouts, and heuristics may fail in complex scenarios where random exploration may succeed. Enhancements are certainly possible, but any approach should be well-tested.

  • @YairCat
    @YairCat 2 ปีที่แล้ว +3

    If I am not mistaken, because the blue node is the action of the opponent then you should not increase the amount of wins in the numerator and only the amount of times he/she played this move. 29:52

    • @cassoulucas
      @cassoulucas 6 หลายเดือนก่อน +1

      Yeah, it's a bit weird because i've seen both implementations online but only increasing the win count from the point of view of the player that made the move in a certain node yielded better results for me

  • @achillesarmstrong9639
    @achillesarmstrong9639 5 ปีที่แล้ว

    interesting

  • @yoyoshi2833
    @yoyoshi2833 8 หลายเดือนก่อน

    Marque-page : 47:00

  • @FullPotatoGaming
    @FullPotatoGaming 8 หลายเดือนก่อน

    32:30

  • @stephenkamenar
    @stephenkamenar 2 ปีที่แล้ว

    live audiences are so gross. so much coughing

  • @pnachtwey
    @pnachtwey 4 ปีที่แล้ว

    I doubt the person giving the lecture has ever written a chess or go program. His explanation of how to terminate the search was not good. Tic-Tac-Toe is a poor example problem. It is too simple. NIM would be a better example.
    I am surprised the algorithm uses a ln and sqrt function as these are time consuming.
    Is this all you need to know to use MCTS?
    The evaluation routine that is used to evaluate nodes is key. There is no point is search deep if you don't know what you are searching for or searching for the wrong thing.

    • @AntonPanchishin
      @AntonPanchishin 3 ปีที่แล้ว +1

      The Ln and Sqrt functions come from good math theory with regards to probability of regret, what's the chance that the best option by chance had a few too many losses early on in the sampling and a sub optimal option (a trap perhaps) had by chance a few wins. The theoretical calculation for that is UCB, which includes ln and sqrt. You can try other non-expensive calculations but I think you will find that althrough you can run more simulations, the simulations will not be as effectively used.

    • @AntonPanchishin
      @AntonPanchishin 3 ปีที่แล้ว +1

      Test out your theories here www.codingame.com/multiplayer/bot-programming/tic-tac-toe and get a really good insight into tradeoffs. Pure MCTS will do pretty well and you'll have a very hard time beating it. Those at the top of the leaderboard mostly use vanilla MCTS.

    • @pnachtwey
      @pnachtwey 3 ปีที่แล้ว +1

      @@AntonPanchishin Thanks for the link. I wrote an Othello program back in 1980 and entered it in the first man machine Othello tournament in Evanston, IL. I met a lot of the Chess pioneers there. Later I wrote my own chess program and entered into USCF chess tournaments but I never got it to play better than I could. I only had a 386 computer. Also, real life got in the way.

  • @naturaljapanese8772
    @naturaljapanese8772 4 ปีที่แล้ว +2

    The 2nd guy is soooo annoying