you are amazing! Every time I have trouble understanding some concept and need to search the web, if I see that you have a video on the topic, it's already solved for me.
Hi, thanks for the great video series. Question: for importance sampling, do we already assume we know what g is, such that given some x, we can compute g(x)? If we do, then couldn't we compute its mean analytically? If not, how do we resolve g(x) in the g(x)/f(x) term?
Hi, thanks for your comment. Yes, you're right that we need to know g(x), at least up to a constant of proportionality. The beauty of importance sampling is that even if g(x) is unnormalised (so we cannot compute the expectation directly), we can still use importance sampling to estimate it. Also, whilst I've illustrated how importance sampling works for a 1D example (where computing integrals is typically quite straightforward), it can be used for higher-dimensional integration, where it is usually not possible to compute integrals analytically. Hope that helps! Best, Ben
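(A minimal sketch of what Ben describes here: self-normalised importance sampling, where the target g is known only up to a constant. The particular target, proposal, and constants below are made-up toy choices, not from the video.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalised target g~(x): a Gaussian bump at 3, times an unknown
# normalising constant (here 5, but the estimator never uses it).
def g_unnorm(x):
    return 5.0 * np.exp(-0.5 * (x - 3.0) ** 2)

# Proposal f(x): N(2, 2^2), easy to sample from and to evaluate.
mu_f, sd_f = 2.0, 2.0
def f_pdf(x):
    return np.exp(-0.5 * ((x - mu_f) / sd_f) ** 2) / (sd_f * np.sqrt(2 * np.pi))

n = 200_000
x = rng.normal(mu_f, sd_f, size=n)      # draws from the proposal f, not g
w = g_unnorm(x) / f_pdf(x)              # unnormalised importance weights
est_mean = np.sum(w * x) / np.sum(w)    # self-normalised IS estimate of E_g[X]

print(est_mean)  # close to 3, the true mean of the target
```

Because the weights appear in both the numerator and denominator, the unknown normalising constant of g cancels, which is why the estimate works without it.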
Right, that cleared things up a lot. Thank you for your time!
@@SpartacanUsuals If we cannot compute the integral of g analytically, in general are we able to compute the integral of f?
@@SpartacanUsuals can you give a concrete example though, if you know one?
@@ianpan0102 hi genius. I meant a concrete example for a multivariate problem: an actual case where he actually does the sampling without knowing the other distribution, not just talk about how it's useful there. Even an 8-year-old would understand that first part. If you have nothing informative to say, don't comment as if you know it.
Hi Lambert, it would be great to include sequential importance sampling in the series.
But how do we get or calculate the ratio between f and g? The true distribution f is unknown, right?
As he explained, we have f(x) but we don't have g(x), so we approximate it by sampling from another distribution, which is h(x)/z. Then we compute W.
@@alikassem812 I think we do have g(x), but sometimes it is very difficult to do the maths with g(x), so we use importance sampling. In this case g(x) is simple, but in Bayesian statistics it is generally a very complex function.
@@thidmg yea, you are right. Thanks
@@thidmg Agree. It seems too easy and useless at first glance, but I guess it's more of a numerical issue why importance sampling has a right to exist, and then it gets super useful.
Great series of videos, which complements the book by Casella very well! Would you please be able to add a link to the Mathematica animation code for both importance and rejection sampling? They would be tremendously useful!
I really like the way you explain things. Thanks a lot.
Ben, you are a life saver.
at 9:18 there is an f(x) missing in the sum, isn't there? Because it is the expectation with respect to f..?
dude, it would be really helpful if you could reference the page in your book that the video refers to.
I'm still a little confused: if we know the ratio g(x)/f(x), can we compute E_g[x] directly? Could you give a concrete example/illustration of a multivariate continuous/discrete case?
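(A toy multivariate example of the kind asked for here, with a made-up target and proposal: a 2D Gaussian target g, sampled indirectly via a wider independent-Gaussian proposal f, reweighting by g(x)/f(x).)

```python
import numpy as np

rng = np.random.default_rng(1)

# Target g: bivariate normal, mean (1, -1), covariance [[1, .5], [.5, 1]].
mean_g = np.array([1.0, -1.0])
cov_g = np.array([[1.0, 0.5], [0.5, 1.0]])
inv_g, det_g = np.linalg.inv(cov_g), np.linalg.det(cov_g)

def g_pdf(x):
    d = x - mean_g
    quad = np.einsum('ni,ij,nj->n', d, inv_g, d)
    return np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(det_g))

# Proposal f: two independent N(0, 2^2) coordinates -- trivial to sample.
sd_f = 2.0
def f_pdf(x):
    return np.exp(-0.5 * np.sum((x / sd_f) ** 2, axis=1)) / (2 * np.pi * sd_f ** 2)

n = 400_000
x = rng.normal(0.0, sd_f, size=(n, 2))        # samples from f, NOT from g
w = g_pdf(x) / f_pdf(x)                       # importance weights g(x)/f(x)
est_mean = np.mean(w[:, None] * x, axis=0)    # IS estimate of E_g[X]

print(est_mean)  # approx (1, -1), the mean of g -- without ever sampling g
```

The point is that knowing the ratio lets us convert averages over f-samples into averages over g, which matters precisely when sampling from g (or integrating it analytically) is hard.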
Professor Ben Lambert, congratulations on the videos. Great didactics. My suggestion is to always keep the footer of the videos clean (no written information), because many people watch your videos with subtitles, and the subtitles often get mixed up with your on-screen writing, which makes it difficult to learn. Again, I congratulate you on the initiative to democratise information here on YouTube.
The best explanation of importance sampling... good animation.
but what happens if i actually wanna sample from that difficult-to-sample distribution, instead of computing things like expectations etc.? like, i want the points themselves.
what if we are interested in another statistic, like the median (r(x)) or a quantile (h(x)) of g(x)? do we just change 'x' in the derivation to 'h(x)' or 'r(x)'?
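(For quantiles specifically, the usual trick is that the normalised weights define a weighted empirical distribution for g, and any quantile can be read off its cumulative weights. A sketch with a made-up N(3, 1) target and N(2, 2^2) proposal; the density constants are dropped since they cancel in the normalised CDF.)

```python
import numpy as np

rng = np.random.default_rng(2)

def weighted_quantile(x, w, q):
    """Quantile q of the weighted empirical distribution given by samples x
    with (possibly unnormalised) importance weights w."""
    order = np.argsort(x)
    x_sorted, w_sorted = x[order], w[order]
    cdf = np.cumsum(w_sorted) / np.sum(w_sorted)  # normalised cumulative weights
    return x_sorted[np.searchsorted(cdf, q)]

# Target g: N(3, 1); proposal f: N(2, 2^2). Weights proportional to g/f.
mu_g, mu_f, sd_f = 3.0, 2.0, 2.0
n = 200_000
x = rng.normal(mu_f, sd_f, size=n)
w = np.exp(-0.5 * (x - mu_g) ** 2) / np.exp(-0.5 * ((x - mu_f) / sd_f) ** 2)

med = weighted_quantile(x, w, 0.5)
print(med)  # approx 3.0, the median of the target g
```

So for an expectation of a function, yes, replace x with h(x) in the sum; for order statistics like the median, it is the weighted empirical CDF that carries over.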
Thank you so much, with the help of your video I finally understand the idea behind IS
You saved my life. Thanks.
Great video! Thanks for putting effort into this, it really helped! Greetings from Spain.
Well done! I like the graphics that show convergence.
Does anybody know the name of the software the uploader was using?
The part where he was showing the automatically plotting graphs uses Mathematica. It's made by Wolfram.
Hi, I really enjoyed that video. But why is the expectation value of 1 always equal to 1?
hi, the expectation of a constant is always that constant: "the constant is constant", it never varies, it's always that same number. In this example it is 1, but it could be any constant, like 1, 2, 3, 4...
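In symbols, for any probability density g(x), the density integrates to 1, so:

```latex
\mathbb{E}_g[1] = \int 1 \cdot g(x)\,\mathrm{d}x = \int g(x)\,\mathrm{d}x = 1,
\qquad
\mathbb{E}_g[c] = \int c \, g(x)\,\mathrm{d}x = c \int g(x)\,\mathrm{d}x = c .
```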
@@thidmg Ohh.. thank you very much. That explanation did not come to my mind. Thank you and stay healthy!
Well and very clearly explained! Thank you!
You actually lost many people at 4:00 when you said "we only have access to the fair dice and somehow we want to work out the mean of the biased dice". Well... if we don't have access to the dice and its probability distribution, how do we even know what kind of dice we are dealing with? And if we do have its distribution, then why bother using the other one's samples?
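(A toy version of that dice setup may help: we know the biased die's pmf on paper but can only physically roll the fair one, so we reweight fair-die rolls. The particular biased pmf below is made up, not from the video.)

```python
import numpy as np

rng = np.random.default_rng(3)

f = np.full(6, 1 / 6)                        # fair die: the one we can roll
g = np.array([.1, .1, .1, .1, .1, .5])       # biased die: known pmf, no rolls
true_mean = np.sum(np.arange(1, 7) * g)      # 4.5, for comparison only

n = 200_000
rolls = rng.integers(1, 7, size=n)           # samples from the fair die only
w = g[rolls - 1] / f[rolls - 1]              # importance weights g(x)/f(x)
est = np.mean(w * rolls)                     # IS estimate of biased-die mean
print(est)  # approx 4.5
```

In this toy case the weighting is of course unnecessary (the pmf is tiny and the mean is a one-line sum); the dice example only illustrates the mechanics that pay off when the target is continuous, high-dimensional, or known only up to a constant.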
this is perfect, thank you
Beautiful video.
This would really help for someone to understand why importance sampling is used for sampling rays in ray tracing. Thanks!!
Hi! I am trying to work out model selection through information criteria. Please could you help by explaining the intuition behind the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC)? If you could start from the Kullback-Leibler divergence, show how to derive AIC (or BIC) from it, and explain how AIC/BIC pick the best model, it would be deeply appreciated. Thanks in advance!
nicely explained, thank you
you are a genius
great video