AI 101: Monte Carlo Tree Search

  • Published 7 Jun 2024
  • Support my videos on Patreon: / ai_and_games
    Tip Me At: www.paypal.me/AIandGames
    Like AI and Games on Facebook: / aiandgames
    Follow me on Twitter: / aiandgames
    --
    This AI 101 gives a brief overview of the logic behind the Monte Carlo Tree Search algorithm.
    If you need to refer back to the Foundation series of AI 101, check out this playlist:
    • Playlist
  • Games

Comments • 34

  • @guillermoleon0216
    @guillermoleon0216 6 years ago +40

    I love that you make these videos. I had a really hard time understanding MCTS when reading papers. Now that I finally got it, I find your explanation very useful for newcomers. I'm currently writing my master's dissertation and I'm working in the GVGP area. Keep up the good work, man.

    • @AIandGames
      @AIandGames  6 years ago +4

      Thanks. I agree; when I first started reading about MCTS, I found it quite difficult to understand. Best of luck with the dissertation!

    • @ludelaire
      @ludelaire 4 years ago +1

      I totally identify with this comment, as I'm currently in a similar situation (writing my master's final project), and this video helped me a lot.

    • @mongoose4960
      @mongoose4960 3 years ago

      Very cool, I'm just getting started on my Master's research; I hope it goes okay :)

  • @jasskaur8541
    @jasskaur8541 7 months ago +12

    The explanation is great, but the pace is too fast; it's hard to focus.

  • @mortenbrodersen8664
    @mortenbrodersen8664 6 years ago +1

    Great video. Thanks!

  • @redrum4486
    @redrum4486 2 years ago

    Really good explanation.

  • @paulstevenconyngham7880
    @paulstevenconyngham7880 4 years ago +11

    Hey mate, love your work. For this video though, I reckon it would have been cool to show more diagrams and/or walk through a couple of examples. It kind of sounds like you're just reading titles out of a textbook in this one :(

  • @ProfessionalTycoons
    @ProfessionalTycoons 5 years ago

    Great video.

  • @legendgames128
    @legendgames128 10 months ago +1

    I feel like a variant of MCTS could pit it against real-time players, so people could watch it change over time, like a chess AI that learns from its mistakes.

  • @sockatume
    @sockatume 6 years ago +3

    So, how closely related is this to human decision making? Parts of it look analogous to how I try to pick good moves in turn based games (especially when it’s an important turn and the game has got away from me a bit), but I’ve no idea whether that means there’s an actual connection.

    • @pixboi
      @pixboi 1 year ago +1

      I think it's very similar, but our accuracy and speed in simulating possible outcomes are much lower, hence it's inaccurate. A beginner chess player sees only the current state, and potentially what effects his decision now could have in the short term.

  • @Lungkisser
    @Lungkisser 6 years ago +5

    Interesting stuff, even for a layman.

  • @Alex_dlc
    @Alex_dlc 6 years ago +58

    Great idea, but it would be nice to see practical examples.

    • @AlexanderGoncharenko
      @AlexanderGoncharenko 5 years ago +2

      Civilization uses it, as far as I know.

    • @theblue3554
      @theblue3554 4 years ago

      Late reply, but AlphaGo uses this along with neural nets.

    • @generichuman_
      @generichuman_ 3 years ago +5

      @@AlexanderGoncharenko @Quantum MoOse lol, I think he means examples where you work out the math on paper, not known implementations of this algorithm

    • @TheRageCommenter
      @TheRageCommenter 3 years ago

      @@theblue3554 was gonna say that too

  • @emil5355
    @emil5355 2 years ago +2

    Isn't there a mistake between Simulation and Backpropagation? I think the number of wins should be incremented for the white player as well. At the new position 0/1, black lost but white won, so the upper node's win variable should be incremented, shouldn't it?
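    The convention being debated here can be sketched in Python. This is an illustrative sketch (not code from the video): the `Node` class and `backpropagate` function are assumed names, and the sketch follows the common rule that each node credits a win only when the playout's winner matches the player whose move led into that node, so win counts alternate up the tree exactly as the comment expects.

    ```python
    # Illustrative MCTS backpropagation sketch: win counts alternate by player.
    class Node:
        def __init__(self, player, parent=None):
            self.player = player      # player who made the move leading here
            self.parent = parent
            self.wins = 0
            self.visits = 0

    def backpropagate(leaf, winner):
        """Walk from the simulated leaf back to the root, updating stats."""
        node = leaf
        while node is not None:
            node.visits += 1
            if node.player == winner:  # credit the side whose move led here
                node.wins += 1
            node = node.parent

    # White moves into the leaf; if white wins the playout, the leaf's win
    # count increments while the parent (reached by black's move) only
    # gains a visit.
    root = Node(player=None)
    black = Node(player="black", parent=root)
    white = Node(player="white", parent=black)
    backpropagate(white, winner="white")
    ```

    Under this convention the "upper node" the comment mentions does get its win incremented whenever it was white's move that produced it.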

  • @ANGRYBIRDSlayer
    @ANGRYBIRDSlayer 6 years ago +1

    Thanks for the video, it would be awesome if you can also provide your sources in the description.

  • @pankaj_pundir
    @pankaj_pundir 3 years ago

    Thanks for this video.

  • @Tamschi
    @Tamschi 6 years ago +4

    I think I understood most of the video, but there's one thing I'm unsure about:
    Once the game advances in reality, is the existing tree normally thrown away or is part of it reused?

    • @AIandGames
      @AIandGames  6 years ago +9

      That's a good question! The original versions of MCTS would discard the tree at the end of the whole playout phase. Given it had made the decision it needed, it would then start all over again, which is a touch wasteful. There's been a lot of research since then into developing new variants of the UCT method that store all or part of the tree, so that it is more memory efficient but also helps minimise how many exploitation playouts are needed to repeatedly establish how strong that corner of the tree is.
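      The reuse idea described in this reply can be sketched roughly in Python. Everything here is an illustrative assumption (the `Node` fields and `advance_root` name are invented for this sketch, not the video's or any specific variant's implementation): after committing to a move, the matching child subtree becomes the new root so its accumulated statistics survive.

      ```python
      # Illustrative sketch of reusing part of the MCTS tree between moves.
      class Node:
          def __init__(self, move=None, parent=None):
              self.move = move
              self.parent = parent
              self.children = []
              self.wins = 0
              self.visits = 0

      def advance_root(root, played_move):
          """Keep the subtree under the move actually played, if explored."""
          for child in root.children:
              if child.move == played_move:
                  child.parent = None    # detach; old siblings become garbage
                  return child           # statistics in this subtree survive
          return Node(move=played_move)  # unexplored move: start a fresh tree
      ```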

  • @morsiskoPC
    @morsiskoPC 5 years ago +2

    The thing that confuses me is the backpropagation step. Let's say you have two players, grey and black, like at 5:11. The grey player won the simulation, yet you only increment the number of games played back up the tree. Shouldn't you also increment the number of wins for the grey player everywhere? Because the players move alternately and both want to make the best move, it would be more intuitive to increment the number of wins so that tree path is selected more often as the algorithm runs (I've seen it done this way in many articles about MCTS).

    • @emil5355
      @emil5355 2 years ago

      Yes, exactly! I was thinking the same thing. I'm glad I found your comment; I think you're right.

  • @user-bu8mg7uq3s
    @user-bu8mg7uq3s 1 year ago

    Thanks.

  • @vornamenachname906
    @vornamenachname906 2 years ago

    Where is the advantage for my network then? What does the network learn from the results of the MCTS?

  • @antonidabrowski4657
    @antonidabrowski4657 3 years ago

    The music in the background is too loud; it's distracting for me.

  • @andrewmadilecy5704
    @andrewmadilecy5704 2 years ago +1

    I don't understand the random-play part of the algorithm. How would making random moves show any sort of advantage in a given position? Take chess, for example: making random moves is generally considered the worst way to play, as it can easily throw away your advantage. So how does randomly simulating games produce any notion of advantage?

    • @emil5355
      @emil5355 2 years ago +2

      Hi, take a look at Monte Carlo experiments; it's a matter of mathematical principles. The premise is: "the more random samples you gather for a specific event (in this case, winning from this position), the more accurate your estimate of that event's probability becomes."
      You should also consider that the machine does an incredibly high number of simulations per second; I don't know the number here, but maybe millions?
      Thus the estimated probability of winning from a certain position gets more and more accurate the more random simulations are done.
      I hope I could help a bit.
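      The convergence this reply appeals to is easy to demonstrate in a toy Python sketch. It is not from the video, and `estimate_win_rate` and the fixed "true" win probability are invented for illustration: averaging more random playouts brings the estimate closer to the underlying win probability.

      ```python
      # Toy Monte Carlo sketch: estimating a position's win probability
      # by averaging random playouts.
      import random

      def simulate(win_prob, rng):
          """One random playout: 1 for a win with probability win_prob."""
          return 1 if rng.random() < win_prob else 0

      def estimate_win_rate(win_prob, n_playouts, seed=0):
          rng = random.Random(seed)
          wins = sum(simulate(win_prob, rng) for _ in range(n_playouts))
          return wins / n_playouts

      # More playouts -> estimate closer to the true 0.7 win probability.
      rough = estimate_win_rate(0.7, 10)
      fine = estimate_win_rate(0.7, 100_000)
      ```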

    • @aesopm9200
      @aesopm9200 1 year ago

      The reason it *might* work is that we are building up the tree, and that part isn't random. So we have a smart first few moves (the tree) and then random play. At first it will be very bad; after many, many tries, the tree builds up and it starts to do better. This is not intuitive.
      Additionally, we don't have to use pure randomness, although AFAIK that often works. We certainly need something fast. We might bias, with a certain probability, towards moves that usually make sense. At some point we might cut the playout off and use a heuristic to evaluate the position (this could be necessary in a game that is unbounded); for example, counting stones could be a score heuristic in Go. But we might make a few random moves first.
      A mate in 2 might be detected by random play. In Ms. Pac-Man quite a few random moves might survive, and conversely, if we are trapped, moving at random would reveal that quite quickly.
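      The "smart first few moves" this reply describes come from the selection step, commonly implemented with the UCT rule. A minimal Python sketch, assuming the usual statistics (`wins`, `visits`, the parent's visit count, and an exploration constant `c`, here set to an illustrative value near sqrt(2)):

      ```python
      # UCT selection score: exploit high win rates, but keep exploring
      # rarely-tried moves so the tree grows in promising directions.
      import math

      def uct_score(wins, visits, parent_visits, c=1.41):
          if visits == 0:
              return float("inf")  # always try unvisited children first
          exploit = wins / visits  # empirical win rate of this child
          explore = c * math.sqrt(math.log(parent_visits) / visits)
          return exploit + explore
      ```

      Each selection step descends into the child with the highest score, which is how the non-random "tree part" of the search gets steered before the random playout takes over.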

  • @BritishProudnShit
    @BritishProudnShit 3 years ago

    You sound a lot like Craig Ferguson