Inverse Transform Sampling : Data Science Concepts

  • Published on Oct 5, 2019
  • Let's take a look at how to transform one distribution into another in data science!
    Note: I should have included a lambda in front of the exponential PDF. I mistakenly forgot it. I appreciate the comments which helped me realize this mistake.
    ---
    Like, Subscribe, and Hit that Bell to get all the latest videos from ritvikmath ~
    ---
    Check out my Medium:
    / ritvikmathematics

Comments • 139

  • @shivamathghara2870
    @shivamathghara2870 4 years ago +79

    pdf of exponential is (lambda)*e^(-lambda x)

  • @gomolemartifex
    @gomolemartifex 4 years ago +23

    This video just transformed my life

  • @sepidet6970
    @sepidet6970 4 years ago +21

    That was a great intuitive explanation of inverse transform sampling. It seems so easy to me after watching this video. Thanks a lot.

  • @bhaskarroy8753
    @bhaskarroy8753 1 year ago +3

    Great video. It made the underlying concept crystal clear. Thanks a lot, Ritvik.

  • @shueibsharif9955
    @shueibsharif9955 2 years ago +2

    I can't thank you enough. You have been of help in many subjects from time series analysis to this. I would like to see EM algorithm, latent class models, and hidden Markov models in the future.

  • @mjf1422
    @mjf1422 4 years ago +22

    Thank you so much for doing these videos.

    • @fyaa23
      @fyaa23 4 years ago

      I can't agree with you more.

  • @kissmeimhuman
    @kissmeimhuman 1 year ago

    I watched a few videos on this and yours was by far the clearest. Thank you.

  • @Bksemsem
    @Bksemsem 9 months ago +1

    I really want to thank you because your clear explanation helped me get an A in my statistical programming exam. You are a hero.

  • @andyak93
    @andyak93 3 years ago

    nice! Thanks for the work. Like the way you explained concepts in a straightforward and smooth way. Please keep it up ! :)

  • @thiagobarreto9056
    @thiagobarreto9056 3 years ago +4

    Just watched two 20-minute videos before this, and neither made me understand this at all. Then I saw this 10-minute video of yours and it made the subject so much clearer than before. Amazing professor, congratulations!

    • @ritvikmath
      @ritvikmath 3 years ago +1

      Great to hear!

  • @EubenM
    @EubenM 3 years ago +5

    You solved a big curiosity I had. I learned about the power of Monte Carlo analysis and how easy it is to get a uniform distribution from Excel, but knew I would always need more specific distributions. So the question was how to get any distribution from a set of randomly generated numbers from the usual Excel Rand() generator. Thanks for the brilliant and easy demonstration! Congrats on your terrific work!

    • @ritvikmath
      @ritvikmath 3 years ago +3

      Great to hear!

  • @elias043011
    @elias043011 1 year ago +1

    You have brilliantly and simply explained a topic that I have been struggling with for a whole semester. Thank you so much! :)

    • @ritvikmath
      @ritvikmath 1 year ago

      Glad it was helpful!

  • @samersheichessa4331
    @samersheichessa4331 3 years ago +1

    Just fantastic ! keep it up man great videos and great explanation

  • @rmiliming
    @rmiliming 1 year ago

    Thanks a lot! Your videos on DS and Stats are the best!

  • @Maikpoint11
    @Maikpoint11 3 years ago +1

    Super helpful, thank you very much!

  • @ronborneo1975
    @ronborneo1975 2 years ago

    Quite an amazing explanation. Well done!!

  • @YingleiZhang
    @YingleiZhang 1 month ago

    Brilliant teacher! I guess it is a sort of gift.

  • @dedecage7465
    @dedecage7465 1 year ago

    This was super pedagogical, thank you very much.

  • @emilioalfaro4365
    @emilioalfaro4365 1 year ago

    very clear explanation, thanks for sharing!

  • @lilmoesk899
    @lilmoesk899 4 years ago +3

    Thanks for the video! I'm still struggling with this, but your explanation definitely helped!

  • @phuongdinh3769
    @phuongdinh3769 10 months ago

    Trying to wrap my head around this in class but to no avail. Thank you so much for your amazing explanation

  • @fredericoamigo
    @fredericoamigo 2 years ago

    Excellent explanation! Keep up the good work!

  • @sebastianmathalikunn
    @sebastianmathalikunn 2 years ago

    Hi Ritvik, great videos! would be interested to have a set of videos explaining variational bayes, ELBO etc. in order to perform bayesian optimisation on hyper-parameters

  • @markusnascimento210
    @markusnascimento210 1 year ago

    Greatly explained! Thanks!

  • @fionnmcglacken35
    @fionnmcglacken35 3 years ago

    Brilliant, thank you so much.

  • @shubhamthakur3461
    @shubhamthakur3461 2 years ago

    Great explanation! Thanks so much :)

  • @katiedunn7369
    @katiedunn7369 2 years ago

    very helpful, thanks for this video!

  • @malhajed
    @malhajed 2 years ago

    I love your explanations. You always produce the best, please don't stop.

  • @ramn9071
    @ramn9071 2 years ago

    Well explained .. thanks. One minor suggestion .. if there is a way you can make the video screen capture friendly or leave a screen capture slides to the video, that would be super helpful. Thanks for the clear presentation.

  • @Juanlufg
    @Juanlufg 3 years ago

    Thank you for this, it has helped me a lot! :)

  • @adishumely
    @adishumely 3 years ago

    great video! thanks!

  • @trollingenstrae2207
    @trollingenstrae2207 3 years ago

    great explanation, thanks a lot!

  • @realimaginary5328
    @realimaginary5328 2 years ago

    Excellent. !

  • @therockbottom2539
    @therockbottom2539 4 months ago

    Love how calm you are. I'm shitting myself when I have to explain topics like these to someone.

  • @berkayyucel1538
    @berkayyucel1538 4 years ago +1

    That was awesome. Thank you !!!!!

  • @musondakatongo5478
    @musondakatongo5478 4 years ago

    Well explained. Thanks a mil

  • @Busterblade20
    @Busterblade20 3 years ago

    Thank you so much. You helped me a lot with my homework.

  • @HadiAhmed7546
    @HadiAhmed7546 2 years ago

    Thanks a lot bro, so helpful

  • @learnphysics6455
    @learnphysics6455 3 years ago

    Gem level bhau

  • @praburocking2777
    @praburocking2777 1 year ago

    great explanation

  • @MegaNightdude
    @MegaNightdude 3 years ago +2

    Brilliant!!!!

  • @teojunwei2000
    @teojunwei2000 3 years ago +10

    Hi, is there an error in the PDF? Shouldn't it be f(x) = lambda * exp(-lambda * x)? Thank you for this video!

  • @joaopedroxavier8474
    @joaopedroxavier8474 3 years ago +1

    Thanks for the video! I was struggling to understand the motivation behind it, but your explanation has made it much easier for me :)

    • @ritvikmath
      @ritvikmath 3 years ago

      Glad it helped!

    • @EubenM
      @EubenM 3 years ago

      João, see my comment above for an example application.

  • @lancelofjohn6995
    @lancelofjohn6995 3 years ago

    Nice lecture!

  • @deepanshu7714
    @deepanshu7714 5 months ago

    u r best teacher ever

  • @mostafaalkady6556
    @mostafaalkady6556 2 months ago

    Great explanation! Thanks.

    • @ritvikmath
      @ritvikmath 2 months ago

      Glad you enjoyed it!

  • @ec-wc1sq
    @ec-wc1sq 3 years ago

    thanks, this is a great video!

  • @aryang5511
    @aryang5511 1 year ago +1

    Great video, it really helps me out a lot. One thing I still don't really understand is why we might do this. As in, why would we use the inverse transformation method to find the exponential random variable instead of just using the exponential PDF directly if we have lambda?

  • @UrBigSisKey
    @UrBigSisKey 2 years ago

    this is great thank you so much :)

  • @emanelsheikh6344
    @emanelsheikh6344 1 year ago

    Thank you 🙏

  • @ahmetkarakartal9563
    @ahmetkarakartal9563 2 years ago

    you saved my life

  • @dwightsablan3571
    @dwightsablan3571 3 years ago +1

    Thank you, this helped a ton! :)

  • @aytekin8669
    @aytekin8669 3 years ago +1

    thanks for good explanation about Inverse Transform sampling !

    • @ritvikmath
      @ritvikmath 3 years ago

      Glad it was helpful!

  • @phalanxz11_
    @phalanxz11_ 4 years ago +3

    Can you please do a video about Copulas? For example in a (credit) risk management context

  • @yaningwang8629
    @yaningwang8629 1 year ago +1

    omg you saved my stats degree, much thanks

  • @farhadbatmanghelich278
    @farhadbatmanghelich278 2 years ago

    Thanks!

  • @riaddjaid7428
    @riaddjaid7428 2 months ago

    Thank you so much sir. I would like to know which commonly used probability distributions the inverse method is used with.

  • @sheeta2726
    @sheeta2726 1 year ago

    Thank you!!!!!!!!!!!

  • @caiyunwurslin2468
    @caiyunwurslin2468 2 years ago

    Thank you. Our instructor did not explain it and just gave the theorem. I was confused like I have three heads.

  • @maximegrossman2146
    @maximegrossman2146 3 years ago

    excellent

  • @yusufelnady3381
    @yusufelnady3381 3 years ago

    Thank you man

  • @OscarBedford
    @OscarBedford 10 months ago

    What is the role of lambda? I've seen other videos that don't include it, so now I'm curious. Amazing explanation btw!

  • @roayadiamond
    @roayadiamond 4 years ago +2

    He is going to be a fabulous professor

    • @ritvikmath
      @ritvikmath 4 years ago

      Haha I appreciate the kind words :)

  • @tj9796
    @tj9796 2 years ago

    Great video. Could you do one on copulas, building on this one?

  • @annabelseah920
    @annabelseah920 2 years ago

    perfect!

  • @kobi981
    @kobi981 1 month ago

    Very nice video! thank you!
    The uniform should be (0,1] without 0 right? so the ln will be defined.
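
The endpoint concern above can be sketched in code. This is a minimal illustration, assuming Python's standard `random` module, whose `random()` returns a value in [0.0, 1.0); the function name `sample_exponential` and the rate `2.0` are hypothetical choices, not taken from the video.

```python
import math
import random

def sample_exponential(lam, rng=random):
    """Draw one Exponential(lam) sample via inverse transform sampling."""
    # rng.random() returns a float in [0.0, 1.0), so 1 - u lies in (0.0, 1.0]
    # and math.log never receives 0.
    u = 1.0 - rng.random()
    return -math.log(u) / lam

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_exponential(2.0, rng) for _ in range(10_000)]
sample_mean = sum(samples) / len(samples)  # expected to be near 1 / 2.0
```

Flipping the draw with `1.0 - rng.random()` is one simple way to exclude 0 while keeping the draw uniform; other generators may document a different interval, so it is worth checking before relying on this.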

  • @jindai5850
    @jindai5850 4 years ago

    Yo Ritvik, not sure if you still remember me; we talked during orientation (I was the guy who worked with Tasty). We had a class last week about MCMC and I was confused about certain parts, and YouTube directed me to this video lol. Great job man, keep it up. Hope we can catch up when things get back to normal after the pandemic.

  • @stipepavic843
    @stipepavic843 2 years ago

    this guy is epic!!!

  • @liamobrien8610
    @liamobrien8610 4 years ago +7

    Great video! Your exponential density is missing its normalizing constant, though. Since your CDF is correct, no harm, no foul, but it might confuse some people.

    • @algrant33
      @algrant33 3 years ago +2

      Yep, I'm looking for the lambda*e^(-lambda*x).

  • @unnikrishnanadoor
    @unnikrishnanadoor 4 years ago

    I have a question: if we graph the inverse function of that exponential function, what will it look like? Will it look similar to the graph of the uniform distribution? Otherwise, how can these be equal?

  • @hashbrowncookie8444
    @hashbrowncookie8444 3 years ago

    So if I had some other distributions apart from exponential one, I just need to derive its inverse, and set the number of simulations I will like to do with a U that is unif from 0 to 1? I just need clarification in that part.

  • @whoami6821
    @whoami6821 4 years ago +2

    Could you make a more advanced time series tutorial? I really like your videos and I'm struggling in a grad-level time series course.

    • @ritvikmath
      @ritvikmath 4 years ago

      More time series vids coming up soon!

  • @geoffreyanderson4719
    @geoffreyanderson4719 2 years ago

    I have a question about the math: how to derive other inverse transformations, especially for datasets that predict the number of clicks on a web page, for example. Some of them are tricky and might even need estimation by iterative numerical methods or ML, because the Poisson is simple to find the inverse function for. And then how exactly do you put the inverse transform into an sklearn pipeline?
    Here's why I ask: sometimes I am using a Generalized Linear Model, which provides a convenient link function already built in, but we are not always going to just use a linear model, as we might need to use, for example, the large feature vectors that an NLM model produces to describe some text. GLM is not necessarily the only tool to consider. Besides random sampling, transforms are also good for ML preprocessing and postprocessing pipelines, to help your model learn more easily.
    log(Y) and exp(Y) are the transformations for the Poisson distribution when your response Y is a count. Quasi-Poisson and Negative Binomial are good for count data when the mean and variance do not stay equal as the Poisson requires, but instead show some overdispersion or underdispersion. There's also the zero-inflated model, which combines a logistic model and a Poisson model in a sort of ensemble to help pre-predict the count = 0 case when 0 appears a lot more often than plain old Poisson can account for alone.

  • @rachidwatcher5860
    @rachidwatcher5860 4 years ago +1

    Thx body u the best

  • @chonglizhao2699
    @chonglizhao2699 3 years ago

    If I understand correctly, the reason the uniform distribution is used is that its output ranges from 0 to 1. Just out of curiosity, can we use a beta distribution in place of the uniform distribution?

  • @ivansaiji
    @ivansaiji 4 years ago

    Not very proficient in statistics, but in sum, if I do the transformation and have the final function, given a number u that is randomly generated from a uniform distribution, I will get an equivalent randomly generated number that falls under an exponential distribution?
    great video, I will subscribe and continue to watch them!

  • @AhmedMohamed-dd4ef
    @AhmedMohamed-dd4ef 7 months ago

    Question: Hi, I have rainfall data as a 2D matrix/frame of the UK every 5 minutes, so the data is spatially and temporally correlated. The data has severe positive skewness: around 90% of pixels or points are less than 10, and 10% are between 10 and 128. When I train a CNN, it only predicts low rainfall values because of the data imbalance. I would like to transform the data to a uniform distribution. I tried a log transformation, which compressed the data, but there is still imbalance. Do you know how to convert to a uniform distribution so that all of the values have the same chance of being predicted? It is a regression task to predict the next 12 frames of rainfall. The data is represented by only one continuous variable, rainfall intensity. Many thanks

  • @lm58142
    @lm58142 1 year ago

    Thanks for sharing. Just one small comment....pdf of the exponential is lambda*e^(-lambda*x).

  • @gavinresch1144
    @gavinresch1144 1 year ago

    Hey - great video! I think you might have forgotten the lambda in front of the exponential for the exponential PDF. If you calculate the CDF from what you have written you will get a 1/lambda factor.

    • @ritvikmath
      @ritvikmath 1 year ago

      Yup you’re definitely right !
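
The 1/lambda factor mentioned in this thread can be checked by integrating both versions of the density from 0 to x. A short worked calculation, assuming x >= 0 and lambda > 0:

```latex
\begin{align*}
\text{without } \lambda:\quad & \int_0^x e^{-\lambda t}\,dt
  = \frac{1}{\lambda}\left(1 - e^{-\lambda x}\right)
  \;\longrightarrow\; \frac{1}{\lambda} \text{ as } x \to \infty \\
\text{with } \lambda:\quad & \int_0^x \lambda e^{-\lambda t}\,dt
  = 1 - e^{-\lambda x}
  \;\longrightarrow\; 1 \text{ as } x \to \infty
\end{align*}
```

Only the second version integrates to 1, which is why the lambda belongs in front of the PDF, while the standard exponential CDF 1 - e^(-lambda x) is unaffected.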

  • @tianjoshua4079
    @tianjoshua4079 3 years ago +1

    Great video. Quick question: at the end of the video, you said we could swap 1- u for u. That means 1 - u = u, which translates into u = 1/2. Yet u is a random variable, it is not necessarily 1/2, right? What am I missing?

    • @ritvikmath
      @ritvikmath 3 years ago +4

      Good question! We are not swapping 1-u for u in an algebraic sense (in which case you would be absolutely correct). Rather, we note that u is a uniform random variable between 0 and 1. Therefore 1-u is also a uniform random variable between 0 and 1. Thus, it does not matter (in terms of probability) whether we use 1-u or u. And using just u makes the formula look a bit nicer.

    • @tianjoshua4079
      @tianjoshua4079 3 years ago

      @@ritvikmath Oh. I understand. RVs are not really variables. When it comes to RVs, what matters is not the specific value of the RV, yet it is the distribution of the RV that matters. Since u and 1-u are both RVs with the same distribution, they are interchangeable.
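
The point made in this thread, that u and 1 - u are different numbers but have the same distribution, can be checked empirically. The following is a rough sketch; the rate `lam = 1.5`, the seed, and the sample size are arbitrary assumptions chosen only for illustration.

```python
import math
import random

lam = 1.5  # assumed rate parameter, chosen only for illustration
rng = random.Random(42)

# Uniform draws strictly inside (0, 1), so log(u) and log(1 - u) are both defined.
us = [u for u in (rng.random() for _ in range(50_000)) if u > 0.0]

with_one_minus_u = [-math.log(1.0 - u) / lam for u in us]
with_u = [-math.log(u) / lam for u in us]

def mean(xs):
    return sum(xs) / len(xs)

# The individual values differ, but both sample means should sit near 1 / lam.
mean_a = mean(with_one_minus_u)
mean_b = mean(with_u)
```

For any single draw the two formulas give different outputs; it is only the distribution of the outputs that matches, which is all that matters for sampling.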

  • @7ignatios
    @7ignatios 4 years ago

    Can you do (or recommend) a video on Granger Causality?

    • @ritvikmath
      @ritvikmath 4 years ago

      Thanks for the suggestion! I'll look into it

  • @adityasaini491
    @adityasaini491 4 years ago +1

    That subtle pen flip at 5:49.. Damnn

    • @EubenM
      @EubenM 3 years ago

      LOL

    • @adityasaini491
      @adityasaini491 3 years ago

      @@EubenM You replied :DD Great videos man! Your channel is awesome :DD

  • @nicnicco
    @nicnicco 1 month ago

    Are there any resources I can look at to understand why it's valid to assume that p(T(U)

  • @scarlettwang2643
    @scarlettwang2643 4 years ago

    if the distribution we want is not the exponential distribution, are the steps are still the same?

  • @blonderuna
    @blonderuna 4 years ago +2

    Hi! I loved the video, but I've got a question. What are the cases when the CDF is not invertible? And what are the strategies then? Should we try to make the CDF invertible by interpolating it or should we use another random variate generation technique? Thank you in advance! Happy New Year.

    • @ritvikmath
      @ritvikmath 4 years ago +3

      Happy new year! And great question, indeed this technique is good only if you can find the inverse of the CDF, so if that is not possible, interpolation is a great idea as long as the fit is "good enough"

    • @jonatangarcia9285
      @jonatangarcia9285 1 year ago

      You can use the generalized inverse of the function. This is a function g such that g(y) is the infimum of the x such that F_X(x) >= y. Since F_X is right-continuous, this infimum is always attained, so it is a minimum. This function satisfies F(g(y)) >= y, with equality wherever F is continuous, so it works like the inverse; the difference is that if several values have the same image you take the least of them, and you can always do that. This is the same function used to calculate quantiles, so Q_{0.5} = g(0.5). Take into account that g(0) = -infinity and g(1) = infinity, to get the values right. More information here: en.wikipedia.org/wiki/Probability_integral_transform
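
The generalized inverse described in this reply can be sketched for a discrete distribution, where the CDF is a step function and has no ordinary inverse. The support `values` and the cumulative probabilities `cdf` below are hypothetical, picked only to keep the example small.

```python
import bisect
import random

values = [0, 1, 2]         # hypothetical support, for illustration only
cdf = [0.25, 0.75, 1.0]    # F(0), F(1), F(2): a step function, not invertible

def generalized_inverse(u):
    """Return the smallest value x with F(x) >= u, for 0 < u <= 1."""
    idx = bisect.bisect_left(cdf, u)  # first index where cdf[idx] >= u
    return values[idx]

rng = random.Random(0)
# 1 - rng.random() lies in (0, 1], matching the domain of generalized_inverse.
draws = [generalized_inverse(1.0 - rng.random()) for _ in range(20_000)]
freq = {v: draws.count(v) / len(draws) for v in values}
```

With enough draws, `freq` should land close to the jump sizes of the CDF (0.25, 0.5, 0.25 here), which is how the same uniform-then-invert recipe extends to distributions without a true inverse.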

  • @user-pl4ix5hp8i
    @user-pl4ix5hp8i 4 years ago

    Having a hard time understanding the EWMA and GARCH models. Can you make some videos introducing them? Thx

    • @ritvikmath
      @ritvikmath 4 years ago +1

      GARCH is coming up soon!

  • @skyful9
    @skyful9 2 years ago

    In data science, can we transform a Weibull distribution into a Gamma or Poisson distribution?

  • @grjesus9979
    @grjesus9979 3 years ago +1

    Then why is the uniform PDF important? I mean, you could sample directly from one distribution to another just by feeding the value returned by the CDF of the first distribution into the inverse CDF of the distribution you want to arrive at. Am I wrong?
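
The chaining idea in this comment can be tried out directly. In this sketch the source is an exponential with an assumed rate `lam = 2.0`, and the target is a hypothetical distribution with CDF F(y) = y^2 on [0, 1], chosen only because its inverse CDF, sqrt(u), is easy to write down.

```python
import math
import random

lam = 2.0  # assumed rate of the source exponential distribution
rng = random.Random(1)

# Source samples X ~ Exponential(lam), drawn with the standard library.
xs = [rng.expovariate(lam) for _ in range(50_000)]

# Step 1: push X through its own CDF; the result behaves like Uniform(0, 1).
us = [1.0 - math.exp(-lam * x) for x in xs]

# Step 2: push U through the inverse CDF of the target.
# Hypothetical target: F(y) = y**2 on [0, 1], so F^{-1}(u) = sqrt(u).
ys = [math.sqrt(u) for u in us]

mean_u = sum(us) / len(us)  # near 0.5 for a uniform
mean_y = sum(ys) / len(ys)  # near 2/3 for the target
```

The chain does work, but the intermediate values `us` are uniform, so the uniform distribution is still the bridge between the two; starting from a uniform generator simply skips step 1.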

  • @zhoucyrus5797
    @zhoucyrus5797 4 months ago

    there is an error for the pdf of the exponential distribution, the lambda is missing.

  • @nishitshukla4139
    @nishitshukla4139 4 years ago +3

    Let's say u = 0.25. Then 1 - u = 0.75, right? Could someone explain how 1 - u = u in the uniform distribution?

    • @awangsuryawan7320
      @awangsuryawan7320 3 years ago

      Up

    • @user-gy7uu9gt8n
      @user-gy7uu9gt8n 3 years ago

      Actually the magic for this inverse transform to work is the equation P(T(U)

  • @annali9577
    @annali9577 3 years ago

    this is super clear and I can go to bed very happy

  • @PavelSTL
    @PavelSTL 4 years ago

    Was hoping to hear more about motivations for WHY i need to know this method for DS. "that's how computer gives you random samples from a distribution" is not enough to care about it. What about cases where maybe I don't have a pdf or it cannot be integrated or I get only proportionality of pdf (like in Bayesian model) so I can't just plug in the variable into the proportional pdf and get accurate samples..... maybe that's when I need to use this method.....

    • @ritvikmath
      @ritvikmath 4 years ago

      I appreciate the feedback!

  • @kerguule
    @kerguule 4 months ago

    I don't get it why the exponential distribution is called memoryless? Yes, I know that that lambda or hazard rate is constant but isn't that just the speed or rate of the probability (not the actual probability because the lambda can be more than 1). From the exponential PDF, you can clearly see that the chances in the early phase are bigger than in the later phases so why is it called memoryless? If I sampled time to failures, should I get more numbers early than later because of that decreasing curve?

  • @hp-qx7tf
    @hp-qx7tf 15 days ago

    beauty

  • @piyushsinha3344
    @piyushsinha3344 2 years ago

    In order to find the inverse of the CDF, we just solve for the value of x... why? In other words, how come x is the inverse of the CDF?
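
One way to see why solving for x gives the inverse: the CDF maps a value x to a probability u, so the inverse has to map u back to x. A worked version for the exponential case, assuming the standard CDF F(x) = 1 - e^(-lambda x) with lambda > 0 and 0 <= u < 1:

```latex
\begin{align*}
u &= F(x) = 1 - e^{-\lambda x} \\
e^{-\lambda x} &= 1 - u \\
x &= -\frac{\ln(1 - u)}{\lambda} = F^{-1}(u)
\end{align*}
```

The last line expresses x purely in terms of u, which is exactly what an inverse function does.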

  • @RagaveshDhandapani
    @RagaveshDhandapani 4 years ago

    Thanks a lot. Can u make a video on generalised normal distribution and inverse to uniform. Please

  • @athantas
    @athantas 4 years ago

    what if the function is not invertible? any way to deal with that?

    • @kocur4d
      @kocur4d 3 years ago

      no, this method works only with invertible functions. You need other sampling methods for those. like MCMC or variants.
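
As a side note to this thread, when a CDF is continuous and increasing but has no convenient closed-form inverse, it can still be inverted numerically. The sketch below uses bisection; the target CDF (a Gamma with shape 2 and rate 1) and the names `cdf` and `invert_cdf` are assumptions made for illustration, not something covered in the video.

```python
import math
import random

def cdf(x):
    # CDF of a Gamma(shape=2, rate=1) variable, used as a hypothetical target.
    return 1.0 - math.exp(-x) * (1.0 + x)

def invert_cdf(u, lo=0.0, hi=50.0, iters=60):
    """Find x with cdf(x) close to u by bisection; assumes cdf is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid   # the answer lies to the right of mid
        else:
            hi = mid   # the answer lies at or to the left of mid
    return 0.5 * (lo + hi)

rng = random.Random(3)
draws = [invert_cdf(rng.random()) for _ in range(5_000)]
sample_mean = sum(draws) / len(draws)  # the target distribution has mean 2
```

This costs many CDF evaluations per sample, so methods such as rejection sampling or MCMC can be a better fit when the CDF itself is expensive or unavailable.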

  • @martinschulze5399
    @martinschulze5399 4 years ago

    It's not a "data science" method (which sounds like it comes from the modern era). It is known as the Smirnov method, after Smirnov, who lived around 1900, and it was likely known before.

  • @kevincannon2269
    @kevincannon2269 2 months ago

    TLDR: The distribution of the CDF of _any_ PDF is uniform, so if you want to sample from a PDF that has an invertible CDF, you can sample from the uniform distribution and convert it to the desired distribution with the inverse of the CDF.
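
The summary above translates into a short, generic recipe. This is a minimal sketch rather than a definitive implementation: `inverse_transform_sample` and `exp_inv_cdf` are hypothetical helper names, and the rate `lam = 0.5` is an arbitrary assumption.

```python
import math
import random

def inverse_transform_sample(inv_cdf, n, rng):
    """Draw n samples by applying an inverse CDF to Uniform[0, 1) draws."""
    return [inv_cdf(rng.random()) for _ in range(n)]

lam = 0.5  # assumed rate

def exp_inv_cdf(u):
    # Inverse of F(x) = 1 - exp(-lam * x); u in [0, 1) keeps the log defined.
    return -math.log(1.0 - u) / lam

rng = random.Random(7)
draws = inverse_transform_sample(exp_inv_cdf, 20_000, rng)
sample_mean = sum(draws) / len(draws)  # should be close to 1 / lam

# Empirical check of the first claim: F(X) should look uniform on [0, 1].
cdf_values = [1.0 - math.exp(-lam * x) for x in draws]
mean_cdf = sum(cdf_values) / len(cdf_values)  # near 0.5 for a uniform
```

Swapping in a different `inv_cdf` is all it takes to target another distribution, provided its inverse CDF can be written down or approximated.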

  • @konstantinkulagin
    @konstantinkulagin 5 months ago

    I probably missed this moment: why transformation to CDF actually gives you desired distribution?