Markov Chains - VISUALLY EXPLAINED + History!

  • Published May 28, 2024
  • In this tutorial, I explain the theoretical and mathematical underpinnings of Markov chains. While I explain all the fundamentals, the focus is on the properties of Markov chains that can be leveraged to apply the Law of Large Numbers when doing Bayesian inference.
    I am also sharing the interesting story of what led to the invention of what we now know as Markov chains. Please use the timestamps below to skip over the historical aspects if you are not interested in them.
    Chapters:
    00:00 Introduction & Recap
    01:40 What is meant by independent sampling?
    02:54 Historical aspects and event that led to the invention of Markov Chains
    09:07 The rest of the tutorial
    The example used in this tutorial is taken from the lecture notes prepared by Dr. Rachel Fewster. She is a professor of Statistics at the University of Auckland, NZ.
    You will find the lecture notes here, in the form of two chapters:
    www.stat.auckland.ac.nz/~fews...
    www.stat.auckland.ac.nz/~fews...
    Some of the animations in the tutorial were created using manim (the toolkit authored by 3Blue1Brown). I used the community version - github.com/manimcommunity/manim/ - and I want to express my gratitude for all the hard work that 3Blue1Brown and the Manim Community have put into this library.
    #markovchains
    #montecarlo
    #bayesianstatistics
  • Science & Technology

Comments • 50

  • @AbhishekKumar-hk6fs
    @AbhishekKumar-hk6fs 9 months ago +2

    I'm awestruck by the level of simplicity you maintained throughout the video.

  • @siishasha2176
    @siishasha2176 1 year ago +5

    Much better than my professor's explanations! Please make more of these, sir! You are a treasure in education!

  • @rohansidankar2385
    @rohansidankar2385 6 months ago +2

    I've gone through a few of your videos for my exam preparation, and I really liked the way you blend history with math. Understanding the history and the personalities behind these mathematical concepts makes me think more about the intuition and discovery involved. I also love the way you teach math.
    Thank you for your efforts in creating these videos; they are truly a great source of learning.

  • @griffinbur1118
    @griffinbur1118 6 months ago +1

    One of the best examples of internet pedagogy I've ever seen. Excellent.

  • @bikinibottom2100
    @bikinibottom2100 1 year ago +1

    Being able to make statistics this captivating is a gift; you have to be very intelligent to articulate ideas in this manner.

  • @brofessorsbooks3352
    @brofessorsbooks3352 2 years ago +3

    Thank you for always including lovely references to your videos!

  • @michaelzumpano7318
    @michaelzumpano7318 1 year ago +1

    Dr. Sachdeva, this was the best introduction to some of the more difficult aspects of Markov chains that I have seen. You made so many good decisions about what to include and not include. I’m glad you referred back to matrices after developing the algebra and I appreciated the historical discussion about dependent probabilities and convergence to the law of large numbers. I especially appreciated the mention of irreducibility as a necessary condition of equilibrium - a very important discriminator that gets no attention in most introductions. I’m glad you didn’t get too wrapped up in Monte Carlo in this session. I’m going to visit your notes, Dr. Fewster’s notes, and view the rest of your videos. Thank you for your work!

    • @KapilSachdeva
      @KapilSachdeva  1 year ago

      Thanks for the appreciation Michael. Glad that you found it helpful.
      Just to be clear I am not a “Dr.” :)

  • @avijitnandy6662
    @avijitnandy6662 9 months ago +1

    You are the best. This channel deserves to flourish; I am not sure why it has so few subscribers. So much effort has been put into each video.

    • @KapilSachdeva
      @KapilSachdeva  9 months ago

      🙏

    • @avijitnandy6662
      @avijitnandy6662 9 months ago

      Can you discuss Brownian motion in one video, or can you point me to a video source for it?

    • @KapilSachdeva
      @KapilSachdeva  9 months ago

      @@avijitnandy6662 See this one for the history and a conceptual explanation: th-cam.com/video/5jBVYvHeG2c/w-d-xo.html
      For a probability-theory perspective, see this:
      stats.libretexts.org/Bookshelves/Probability_Theory/Probability_Mathematical_Statistics_and_Stochastic_Processes_(Siegrist)/18%3A_Brownian_Motion/18.01%3A_Standard_Brownian_Motion

  • @sirdancealot6204
    @sirdancealot6204 5 months ago +1

    Thank you very much for the excellent explanations and materials! They are very helpful for getting familiar with this complex subject.

  • @BrunoMorabito-uy3wq
    @BrunoMorabito-uy3wq 1 year ago +2

    Amazing explanation!!!! Thank you

  • @robertoslepetys9230
    @robertoslepetys9230 1 year ago +1

    Thank you very much, Dr. Sachdeva. Your videos are very clear, well formulated, and rich, and you are a fantastic teacher! Very inspiring videos; thank you for the effort you put into making them. It is a great contribution to a better world.

    • @KapilSachdeva
      @KapilSachdeva  1 year ago

      🙏 thanks for the kind words.

  • @AruneshKumarSinghPro
    @AruneshKumarSinghPro 1 year ago +1

    Nice explanation as always. There is one mistake at 22:00: instead of p(X_i = j | X_0 = i) it should be p(X_1 = j | X_0 = i). I checked the notes.

    • @KapilSachdeva
      @KapilSachdeva  1 year ago

      🙏 Yes, I am aware of the typo :( but glad you looked at the notes.

  • @mohiniprashantkulkarni6779
    @mohiniprashantkulkarni6779 1 year ago +1

    Very nice explanation, sir.

  • @EugeneEmile
    @EugeneEmile 1 year ago +1

    Great stuff!

  •  3 months ago

    Hello, can you show examples of Markov chain code written in Fortran?

  • @TheCriticsAreRaving
    @TheCriticsAreRaving 1 year ago +1

    Excellent video! Onto MCMC!

  • @yeo2octave27
    @yeo2octave27 2 years ago +2

    Again, thanks so much for the series! I'm still learning and might not be clear about the content, but at 22:27, for the calculation of the distribution of the random variable X at t=1 using the sum of probabilities, it should be X_1 = j instead of X_i = j, right?

    • @KapilSachdeva
      @KapilSachdeva  2 years ago +3

      Good catch. Indeed, I made a typo. Much appreciated. At least someone is paying attention :) … thanks again!

  • @sklkd93
    @sklkd93 1 year ago +1

    Thanks for another phenomenal tutorial.
    Question - if a Markov chain can reach the same equilibrium regardless of the initial vector, does that mean the transition matrix to the nth power converges to a matrix such that all vectors multiplied by it will result in the same output vector?
    In other words, this matrix’s column space is just a single point?
    I’m having some trouble wrapping my head around this. Would that be a zero dimensional matrix? Or am I mistaken in my earlier logic?

    • @KapilSachdeva
      @KapilSachdeva  1 year ago

      Your earlier logic is correct: the powers of the transition matrix converge to a matrix that maps every initial vector to the same output vector.
      I am not sure we can call it a zero-dimensional matrix, though.
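To make the thread above concrete: for an irreducible, aperiodic chain, the powers of the transition matrix converge to a rank-1 matrix whose rows are all equal to the stationary distribution (so "rank 1" is the usual term, rather than "zero-dimensional"). A minimal numpy sketch, using a made-up 3-state chain rather than the example from the video:

```python
import numpy as np

# Hypothetical 3-state transition matrix (each row sums to 1);
# it is irreducible and aperiodic, so an equilibrium exists.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
])

# For large n, every row of P^n approximates the stationary distribution.
Pn = np.linalg.matrix_power(P, 50)
print(Pn)

# Two different initial distributions are mapped to the same output vector.
a = np.array([1.0, 0.0, 0.0]) @ Pn
b = np.array([0.2, 0.5, 0.3]) @ Pn
print(np.allclose(a, b))  # True
```

Since all rows of the limiting matrix are identical, multiplying any probability vector by it returns that common row, which is why the starting distribution does not matter.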

  • @proxyme3628
    @proxyme3628 1 year ago +1

    I wish you would create another video on PageRank as a Markov chain.

  • @yli6050
    @yli6050 1 year ago +1

    Should the probability after the summation be P(X_1 = j | X_0 = i)? (First line)

    • @KapilSachdeva
      @KapilSachdeva  1 year ago

      Yes, I made a typo. Someone else pointed it out as well; see the other comment on this video.

  • @ssshukla26
    @ssshukla26 2 years ago +2

    Hah... Now I know what I never knew... Thanks...

    • @KapilSachdeva
      @KapilSachdeva  2 years ago

      😀 So what was it that you did not know?

    • @ssshukla26
      @ssshukla26 2 years ago +1

      @@KapilSachdeva What led to the invention of Markov chain theory. I was aware of the concept; now I have a broader picture.

    • @KapilSachdeva
      @KapilSachdeva  2 years ago +3

      I was not sure if I should include these historical aspects. I almost removed them from the video before publishing, but then just took a chance. I love history (of any kind) but was, and still am, afraid of imposing what I love on the audience. Do you have any feedback? Of course it is subjective, as some people would like it whereas others would consider it a waste of time.

    • @krgonline
      @krgonline 1 year ago +2

      @@KapilSachdeva By all means you should. Your comments on the scientists' traits are very perceptive.

    • @KapilSachdeva
      @KapilSachdeva  1 year ago +1

      🙏

  • @bnglr
    @bnglr 1 year ago

    You forgot to conclude what the Markov chain has to do with the independence assumption.

    • @KapilSachdeva
      @KapilSachdeva  1 year ago +1

      Ah, perhaps I did not say it explicitly. Here is an attempt to clarify.
      Quite a few statistical inference procedures (e.g. Bayesian inference) depend on sampling from a distribution. Sampling is essential in order to use the Law of Large Numbers (if you do not know what the Law of Large Numbers is, this line and the next ones will not make sense).
      Now, how could we sample from "any arbitrary" distribution? Over the years many sampling algorithms have been designed, but they do not work well when your data is high-dimensional, and/or they require some knowledge of the data to fine-tune them. Most of these sampling algorithms assume samples are independent of each other.
      A Markov chain, on the other hand, can be viewed as a series of samples in which a given sample depends on the previous state. "This means that there is a way to navigate in the high-dimensional space." [The previous line may be difficult to understand depending on your background.]
      Sampling algorithms typically sample via substitute or proxy distributions. So one idea here is: could I use a "Markov chain" as the proxy distribution?
      But the original (many, many moons ago :)) thinking was that the Law of Large Numbers could only be applied when samples are independent.
      The tutorial shows that once a Markov chain has reached stationarity, if you keep running it you will visit the states (or record samples) in accordance with their plausibility. This means we are then able to apply the Law of Large Numbers, which in turn means the Law of Large Numbers is not only applicable when samples are independent.
      I know the above explanation could be difficult to follow, as it depends on background in multiple things. Connecting the dots here is a tad bit challenging, so please feel free to ask any follow-up question.
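The point about dependent samples can be sketched numerically: simulate one long run of a chain (so consecutive samples are clearly dependent) and check that the fraction of time spent in each state still matches the stationary distribution, which is what lets the Law of Large Numbers apply. The 3-state matrix below is made up for illustration and is not the one from the video:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical irreducible 3-state transition matrix (rows sum to 1).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
])

# Simulate one long, dependent sample path.
n_steps = 100_000
state = 0
visits = np.zeros(3)
for _ in range(n_steps):
    state = rng.choice(3, p=P[state])
    visits[state] += 1
empirical = visits / n_steps  # fraction of time spent in each state

# Stationary distribution: the left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()

# Despite the dependence, the time averages match the stationary probabilities.
print(empirical, pi)
```

The empirical frequencies converge to pi even though no two consecutive samples are independent, which is exactly the property MCMC methods rely on.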

    • @fadyelgawly6522
      @fadyelgawly6522 1 year ago

      ​@@KapilSachdeva
      So, Dr. Sachdeva, to sum up and to see whether I understand it properly:
      We collect, say, a million samples from a distribution and take their mean (as in Monte Carlo), and then we repeat this process multiple times. Previously we said that each sample (a million draws each time) is independent of the others, but all the sample means together form a Gaussian distribution around the expected value of the original distribution.
      Now we say that after the Markov chain reaches stationarity the samples (a million each time) are dependent, and I can predict from each sample what the next sample will be using the transition matrix, which is updated along with the state probability vector PI after running it multiple times. I can then take this state probability (PI) and use its largest value as my expected value for the original distribution (for example, here we have 4 states, so the highest one is approximately the expected value of my original distribution, and the others are roughly ordered by their probability values).
      Am I right here, sir, or not? If not, please clarify for me. I am really thankful for your support in advance.