Data Science Interview Mastery: How to Solve Sampling and Simulation Problems with Ease!

  • Published on Jul 24, 2024
  • Simulate Biased Coin and Fair Coin | Coin problems in Data Science Interviews | Data science Interview questions | Statistics Interview questions
    In this video we will go over sampling problems using biased and fair coins. These questions are very commonly asked in Data Science interviews.
    🟢Get all my free data science interview resources
    www.emmading.com/resources
    🟡 Product Case Interview Cheatsheet www.emmading.com/product-case...
    🟠 Statistics Interview Cheatsheet www.emmading.com/statistics-i...
    🟣 Behavioral Interview Cheatsheet www.emmading.com/behavioral-i...
    🔵 Data Science Resume Checklist www.emmading.com/data-science...
    ✅ We work with Experienced Data Scientists to help them land their next dream jobs. Apply now: www.emmading.com/coaching
    // Comment
    Got any questions? Something to add?
    Write a comment below to chat.
    // Let's connect on LinkedIn:
    / emmading001
    ====================
    Contents of this video:
    ====================
    0:00 Intro
    0:24 Fair coin from biased coin
    2:18 Improving efficiency
    5:11 N numbers from biased coin
    8:59 Biased coin from fair coin

Comments • 27

  • @insigh01
    @insigh01 3 years ago +2

    I'm always looking forward to your videos Emma! Your explanations and examples are always clear and thorough. Thanks for what you do!

  • @Ericq1118
    @Ericq1118 3 years ago +3

    I really like the real data science interview series! Thank you for sharing!

  • @pro100olga
    @pro100olga 3 years ago

    Such a great explanation! I have read similar explanations before but did not fully understand them, and here you explained it very comprehensibly. Thanks!

  • @gengtony1204
    @gengtony1204 2 years ago +1

    Very informative video covering lots of content clearly. Just adding a bit on the rejection sampling: in the case we get the results > 5 (useless sample), we just throw the coin three times again and if the results are

  • @Carolina-fn1zw
    @Carolina-fn1zw 3 years ago

    Emma you're the best for explaining things! Thanks for your videos

    • @emma_ding
      @emma_ding  3 years ago +1

      Thanks Carolina! Glad you like them!

  • @crossvalidation1040
    @crossvalidation1040 2 years ago

    The best explanation of this type of question!

  • @michaelq4261
    @michaelq4261 2 years ago

    Yes, the coin problems!

  • @Han-ve8uh
    @Han-ve8uh 3 years ago +1

    Hi Emma, loved these examples of using previous ideas to answer the next questions. Where can I find more of these?
    I would also like to clarify some concepts in the "improving efficiency" section. I understand the theory at 3:25 that the sum of the probabilities of the 1st case in the 3 pairs = the sum of the probabilities of the 2nd case in the 3 pairs. However, what exactly does this mean for how we make decisions in the experiment? When you mention this sum=sum concept, it makes me think I have to keep flipping until I have seen all of the 1st (or 2nd) combinations in each of the 3 pairs, meaning 3x4=12 flips to make one heads/tails decision.
    However, my intuition is that you only need 1 of the 6 combinations to reach a heads/tails decision: if you get any combination in the first 3, heads is decided, and if you get any combination in the second 3, tails is decided. If my guess is right, then I don't see why the 2 concepts of "putting some outcomes into 3 pairs" and "sum of 1st elements across pairs = sum of 2nd elements across pairs" are important. I'm thinking there is no need to identify 3 pairs. As long as I can find at least 1 pair among all the outcomes (2 different combinations with the same probability), I can already assign one combination in the pair to heads and the other to tails, so I don't see how identifying 3 pairs helps.
    Could you comment on whether my thinking is wrong and how the 3 pairs and sum=sum translate to decision making in the experiment?
    I see that the general idea of improving efficiency is to reuse inconclusive cases from previous experiments. Can we stretch this further to 8 flips to also make HHHH and TTTT useful? That is, apply "improving efficiency" to the "improving efficiency" solution recursively, to reduce the rejection probability even further from the current 0.81^2 + 0.01^2 = 0.6562?

    • @emma_ding
      @emma_ding  3 years ago +2

      1. Those questions come from researching and organizing questions while I prepared for interviews. You can take a look at Reddit, Glassdoor, or other forums for interview questions.
      2. Your intuition that you only need 1 pair (2 permutations) out of the 6 is correct. We want to use as many pairs as possible to increase the efficiency.
      3. Using 8 flips will further improve the efficiency. Intuitively, you are recycling the information in the previous tosses instead of throwing it away, which makes your strategy more efficient.
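
The recycling idea in points 2 and 3 can be sketched in Python. This is a minimal sketch under assumptions, not the exact procedure from the video: the names `fair_bit` and `average_flips` are hypothetical, the biased coin is simulated with `random.Random`, and `levels` controls how deep the recycling goes (1 is the plain pair rule, 2 also compares an HH pair against a TT pair, 3 compares HHHH against TTTT).

```python
import random

def fair_bit(p, rng, levels):
    """Return (bit, biased_flips_used) for a coin with P(heads) = p."""
    flips = 0

    def block(level):
        # Returns (decision, None) once a fair decision is reached, or
        # (None, symbol) when every flip in the block showed the same symbol.
        nonlocal flips
        if level == 0:
            flips += 1
            return None, ("H" if rng.random() < p else "T")
        decided, first = block(level - 1)
        if decided is not None:
            return decided, None
        decided, second = block(level - 1)
        if decided is not None:
            return decided, None
        if first != second:
            # e.g. HT vs TH, or HHTT vs TTHH: both orders are equally likely.
            return first, None
        return None, first  # still inconclusive, pass the symbol up a level

    while True:
        decided, _ = block(levels)
        if decided is not None:
            return decided, flips

def average_flips(p, levels, trials=40000, seed=0):
    """Estimate the share of heads and the mean number of biased flips per fair bit."""
    rng = random.Random(seed)
    results = [fair_bit(p, rng, levels) for _ in range(trials)]
    heads_share = sum(bit == "H" for bit, _ in results) / trials
    mean_flips = sum(n for _, n in results) / trials
    return heads_share, mean_flips

share_1, flips_1 = average_flips(0.9, levels=1)
share_2, flips_2 = average_flips(0.9, levels=2)
print(share_1, flips_1)
print(share_2, flips_2)
```

With p = 0.9 both settings should give a heads share near 0.5, and `levels=2` should need fewer biased flips per fair bit on average than `levels=1`. How much deeper levels help depends on p.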

    • @sitongchen6688
      @sitongchen6688 1 year ago

      @emma_ding Hi Emma, thanks for sharing! I have the same question in mind. Does that mean we need 12 flips to decide one heads-or-tails event? Thanks!

  • @LouisChiaki
    @LouisChiaki 3 years ago +2

    Sounds very close to talking about entropy :D

  • @SuperLOLABC
    @SuperLOLABC 3 years ago +1

    Hi Emma, can you suggest where one can prepare for behavioural interviews and stats & probability interviews? Also, how common do you think non-data-structure Python questions are?

    • @emma_ding
      @emma_ding  3 years ago +3

      I have listed materials I used in this post. towardsdatascience.com/how-i-got-4-data-science-offers-and-doubled-my-income-2-months-after-being-laid-off-b3b6d2de6938. For non data structure coding, you can refer to towardsdatascience.com/the-ultimate-guide-to-acing-coding-interviews-for-data-scientists-d45c99d6bddc.

    • @SuperLOLABC
      @SuperLOLABC 3 years ago

      @emma_ding Thanks for the reply! I saw in one of your other videos that you scheduled an interview a month after getting in contact with a recruiter. I'm curious to know if I can schedule a technical phone screen 4 weeks after a recruiter call and an on-site 5-6 weeks after the technical phone screen, or is that timeline too long?

  • @jiayiwu4101
    @jiayiwu4101 3 years ago

    Great video! Thank you! But even if we toss 4 times, we cannot guarantee that we get 4 different results. How would 2^m be enough?

    • @emma_ding
      @emma_ding  3 years ago

      It's the number of permutations. For 2 tosses you get 4 permutations and for 3 tosses you get 8. To represent (at least) N numbers, you need ceil(log2(N)) tosses.
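
To make the counting concrete, here is a minimal Python sketch of drawing one of N numbers uniformly from fair tosses. It assumes a fair bit source is already available (here `random.Random.getrandbits(1)` stands in for a fair coin built from the biased one), and the name `uniform_index` is hypothetical. `(n - 1).bit_length()` is an exact integer way to get ceil(log2(N)); values that fall outside the range are rejected and the tosses are repeated.

```python
import random
from collections import Counter

def uniform_index(n, fair_bit):
    """Return an integer in [0, n) using m = ceil(log2(n)) fair bits per attempt."""
    if n < 1:
        raise ValueError("n must be at least 1")
    m = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1
    while True:
        value = 0
        for _ in range(m):
            # Each toss contributes one binary digit, so m tosses index 2**m outcomes.
            value = (value << 1) | fair_bit()
        if value < n:
            return value  # accepted: every value in [0, n) is equally likely
        # Otherwise the outcome is a "useless sample": toss m times again.

rng = random.Random(0)
counts = Counter(uniform_index(6, lambda: rng.getrandbits(1)) for _ in range(30000))
print(sorted(counts.items()))
```

For N = 6 this uses 3 tosses per attempt, since 3 tosses give 8 permutations, and the two unused permutations are thrown away.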

    • @jiayiwu4101
      @jiayiwu4101 3 years ago

      @emma_ding I see. Thank you!

  • @wuliwala3761
    @wuliwala3761 2 years ago

    For the biased coin problem with extreme p, for example p=0.9, if the new way (4 coins) is used as you suggested, I got P(throwing away results) = 1 - (.9^3*.1*2 + .9^2*.1^2*2 + .1^3*.9*2) = 0.8362, which is surprisingly even worse than using 2 coins. I used the other way and got the same result: .9*.1*2 + .81^2 + .01^2 = 0.8362. Where am I wrong? 🙃
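
One way to check this arithmetic is to enumerate all 16 four-flip sequences. The sketch below rests on an assumption about the scheme, inferred from the thread rather than confirmed by the video: a first pair of HT or TH is decided immediately, and only when the first pair is HH or TT do the three second-stage pairs in `PAIRS` apply. The names `PAIRS`, `prob`, and `decide` are hypothetical.

```python
from itertools import product

P = 0.9  # heads probability used in the comment above

# Assumed second-stage pairs (the first two flips were HH or TT).
PAIRS = {"HHHT": "H", "HHTH": "T",
         "HHTT": "H", "TTHH": "T",
         "TTHT": "H", "TTTH": "T"}

def prob(seq, p=P):
    """Probability of one H/T sequence under independent biased flips."""
    return p ** seq.count("H") * (1 - p) ** seq.count("T")

def decide(seq):
    """Map four flips to 'H', 'T', or None (rejected) under the assumed rule."""
    first = seq[:2]
    if first == "HT":
        return "H"
    if first == "TH":
        return "T"
    return PAIRS.get(seq)

all_seqs = ["".join(s) for s in product("HT", repeat=4)]
p_heads = sum(prob(s) for s in all_seqs if decide(s) == "H")
p_tails = sum(prob(s) for s in all_seqs if decide(s) == "T")
p_reject_four = 1 - p_heads - p_tails        # only HHHH and TTTT are left over
p_reject_two = P ** 2 + (1 - P) ** 2         # plain pair rule rejects HH and TT
p_pairs_only = sum(prob(s) for s in PAIRS)   # the six second-stage sequences alone

print(round(p_reject_four, 4), round(p_reject_two, 4), round(1 - p_pairs_only, 4))
```

Under that assumption the four-flip rejection probability is about 0.6562, below the 0.82 of the plain pair rule. The 0.8362 figure is what comes out when only the six second-stage sequences are counted as useful, which leaves out the 0.18 of cases already decided by a first pair of HT or TH.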

  • @firesongs
    @firesongs 2 years ago

    When you get HT and TH from the Biased coin, why does HT mean that it is Heads and TH mean that it is Tails?
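
On the labeling question: which of HT and TH is called heads is a convention. What makes the pair rule fair is that the two outcomes are equally likely, P(HT) = p(1-p) = P(TH), so swapping the labels works just as well. A minimal Python sketch of the pair rule, assuming the biased coin is simulated with `random.Random` (the helper names `biased_flip` and `fair_flip` are illustrative and not taken from the video):

```python
import random

def biased_flip(p, rng):
    """Return 'H' with probability p, else 'T' (stand-in for the biased coin)."""
    return "H" if rng.random() < p else "T"

def fair_flip(p, rng):
    """Flip the biased coin in pairs until the two flips differ."""
    while True:
        a, b = biased_flip(p, rng), biased_flip(p, rng)
        if a != b:
            # HT and TH each occur with probability p * (1 - p),
            # so reporting the first flip of the pair is a fair outcome.
            return a  # HT -> 'H', TH -> 'T'
        # HH or TT: discard both flips and try again.

rng = random.Random(0)
flips = [fair_flip(0.9, rng) for _ in range(20000)]
heads_share = flips.count("H") / len(flips)
print(heads_share)
```

Even with p = 0.9 the printed share of heads should be close to 0.5.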

  • @qifeizhang4834
    @qifeizhang4834 3 years ago

    You jumped from 2 tosses to 4 tosses. Why not consider 3 tosses? Is it because of efficiency?

    • @Han-ve8uh
      @Han-ve8uh 3 years ago

      @qifei zhang I attempted a 3-toss efficiency calculation above (reusing the last toss instead of throwing away 2 tosses as described in the video). Do you agree with that reasoning?

  • @iriswang8401
    @iriswang8401 3 years ago +5

    We can glue two biased coins together to get a fair coin lol. Well, this is a brain teaser answer.

  • @anandvyavahare2031
    @anandvyavahare2031 2 years ago +1

    I will never be able to crack these types of questions...