Probability Distributions Made Easy: Top 3 to Know for Data Science Interviews

  • Published on Aug 26, 2024

Comments • 11

  • @chihirobabuska4422 2 years ago +9

    Hi Emma, thanks for your wonderful video.
    In your Binomial example, I would like to point out that the click-through rate is approximately normally distributed, by the Central Limit Theorem.
    Assume the total number of clicks follows Binomial(n, p): there are n impressions in total, and whether each impression ends up as a click is a Bernoulli(p) variable. In other words, each impression has only two outcomes, and with probability p it ends up as a click.
    The click-through rate is the average of these n Bernoulli variables. By the CLT, for large n that average is approximately normally distributed.
    After all, the click-through rate can take many values between 0 and 1, while a Bernoulli distribution is a discrete distribution with only 2 outcomes.
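
The argument in this comment can be checked with a small simulation. The sketch below is only an illustration: the values n = 1000 impressions and p = 0.1 are assumptions chosen for the example, not figures from the video. It draws many click-through rates, each the mean of n Bernoulli(p) outcomes, and compares their spread with the normal approximation's standard deviation sqrt(p(1-p)/n).

```python
import random
import statistics


def simulate_ctr(n_impressions, p, n_trials, seed=0):
    """Simulate n_trials click-through rates, each the mean of n_impressions Bernoulli(p) draws."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_trials):
        # Each impression is a click with probability p (one Bernoulli trial).
        clicks = sum(1 for _ in range(n_impressions) if rng.random() < p)
        rates.append(clicks / n_impressions)
    return rates


# Assumed example values (hypothetical, not from the video).
n, p = 1000, 0.1
rates = simulate_ctr(n, p, n_trials=2000)

mean_ctr = statistics.fmean(rates)
sd_ctr = statistics.pstdev(rates)
theoretical_sd = (p * (1 - p) / n) ** 0.5

# Under a normal approximation, roughly 68% of rates fall within one
# standard deviation of p.
within_one_sd = sum(abs(r - p) <= theoretical_sd for r in rates) / len(rates)

print(f"simulated mean CTR: {mean_ctr:.4f}")
print(f"simulated sd: {sd_ctr:.5f}, normal-approximation sd: {theoretical_sd:.5f}")
print(f"share within one sd of p: {within_one_sd:.3f}")
```

Note that the click count itself is still Binomial(n, p); the normal curve is an approximation to the rate that improves as n grows.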

  • @beyondtheclouds95 1 year ago

    The churn example is gold!

  • @danielrad7991 2 years ago +7

    In the first example (avg time spent per user per day), the sample size is 10. Can we assume normality when the sample size is that small?

    • @stella123www 1 year ago

      Same question. I think that with n = 10 the CLT doesn't apply. She probably meant that when the sample size is larger than 30, the sample average of time spent per user per day is approximately normally distributed.

    • @yunyihuang9476 1 year ago

      n is 1000 in this example
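
The sample-size question in this thread can be explored with a rough simulation. The sketch below assumes, purely for illustration, that daily time spent is exponentially distributed with a mean of 30 minutes (a skewed distribution; the video does not specify one). It compares the skewness of sample means for n = 10 and n = 1000. A skewness near 0 is what a normal distribution would show, and "n > 30" is a rule of thumb rather than a hard threshold.

```python
import random
from statistics import fmean


def skewness(xs):
    """Population skewness: third central moment over the cubed standard deviation."""
    m = fmean(xs)
    m2 = fmean((x - m) ** 2 for x in xs)
    m3 = fmean((x - m) ** 3 for x in xs)
    return m3 / m2 ** 1.5


def sample_means(sample_size, n_samples, rng, mean_minutes=30.0):
    """Means of n_samples samples, each of sample_size exponential draws (assumed distribution)."""
    return [
        fmean(rng.expovariate(1 / mean_minutes) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


rng = random.Random(0)
skew_small = skewness(sample_means(10, 2000, rng))    # n = 10
skew_large = skewness(sample_means(1000, 2000, rng))  # n = 1000

print(f"skewness of sample means, n=10:   {skew_small:.3f}")
print(f"skewness of sample means, n=1000: {skew_large:.3f}")
```

For exponential data the skewness of the sample mean is 2/sqrt(n) in theory, so it shrinks slowly: still clearly visible at n = 10 and close to zero at n = 1000.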

  • @songsong2334 2 years ago

    Thanks, Emma, for the great video! If we map these distributions to A/B testing, a binary outcome follows a binomial distribution. Are all other cases then normally distributed, according to the Central Limit Theorem? I do not have much practical experience with A/B testing, and would love to know how to decide which distribution to use in an A/B test. Why do we have to specify a t-test or a z-test?
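
As one possible illustration of the z-test side of this question: for a binary metric such as click-through rate with large samples, a two-proportion z-test is a common choice, because the variance follows from the proportion itself. A t-test is typically used when the variance of a continuous metric has to be estimated from the sample, and its result approaches the z-test's as the sample grows. The sketch below is a minimal example; the function name and the click counts are hypothetical, not taken from the video.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for a difference in proportions, using the pooled proportion."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Under the null hypothesis both groups share one click probability.
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 10,000 impressions per group.
z, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```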

  • @emmysway96 4 months ago

    I think the green and blue parameters are swapped for the normal distribution diagram.

  • @keshavgupta308 2 years ago

    Hi Ma'am, remember me?
    I love the way you taught us everything 😍😍🤗🤗

  • @raghavmittal2397 4 months ago

    Hi Emma, I have a question: how would one calculate the average time spent per user per day?
    Say we select a random sample of 10 users, as in your example in the video. For those 10 users we have data on the time spent per day. A user might have several time-spent-per-day values, depending on whether they were active on several days. So do we first calculate, for each user, the average time spent per day by that user, and then average those individual averages across the 10 users?

    • @raghavmittal2397 4 months ago

      Another way to approach this question: randomly select, say, 10 dates, calculate the average of the total time spent per user per day for those dates, and repeat the process 1000 times. Would this method work?
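
The two ways of averaging raised in this thread give different numbers when users are active on different numbers of days. The sketch below uses a small made-up log (the user IDs and minutes are hypothetical) to compare the average of per-user averages with the pooled average over all user-days. Which definition is the right one depends on how the metric is defined, which the video does not spell out.

```python
from statistics import fmean

# Hypothetical log: user id -> minutes spent on each day the user was active.
minutes_by_user = {
    "u1": [30, 50, 40],
    "u2": [10],
    "u3": [20, 60],
}

# Option 1: average each user's days first, then average across users.
# Every user counts equally, however many days they were active.
per_user_avg = {user: fmean(days) for user, days in minutes_by_user.items()}
avg_of_user_avgs = fmean(per_user_avg.values())

# Option 2: pool all user-day records and average them directly.
# Users with more active days carry more weight.
all_user_days = [m for days in minutes_by_user.values() for m in days]
pooled_avg = fmean(all_user_days)

print(f"average of per-user averages: {avg_of_user_avgs:.1f} minutes")
print(f"pooled average over user-days: {pooled_avg:.1f} minutes")
```

Here the first option gives 30 minutes and the second 35, because the user with a single 10-minute day counts as one third of the users but only one sixth of the user-days.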

  • @lenka4662 1 year ago

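    Hi Emma, hello. I have a math degree from a university in mainland China and have never studied abroad. Is there any hope for me to compete for data scientist positions overseas?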