Data Science Interview Mastery: How to Solve Sampling and Simulation Problems with Ease!
- Published on Jul 24, 2024
- Simulate Biased Coin and Fair Coin | Coin problems in Data Science Interviews | Data science Interview questions | Statistics Interview questions
In this video we will go over sampling problems using biased and fair coins. These questions are very commonly asked in Data Science interviews.
🟢Get all my free data science interview resources
www.emmading.com/resources
🟡 Product Case Interview Cheatsheet www.emmading.com/product-case...
🟠 Statistics Interview Cheatsheet www.emmading.com/statistics-i...
🟣 Behavioral Interview Cheatsheet www.emmading.com/behavioral-i...
🔵 Data Science Resume Checklist www.emmading.com/data-science...
✅ We work with Experienced Data Scientists to help them land their next dream jobs. Apply now: www.emmading.com/coaching
// Comment
Got any questions? Something to add?
Write a comment below to chat.
// Let's connect on LinkedIn:
/ emmading001
====================
Contents of this video:
====================
0:00 Intro
0:24 Fair coin from biased coin
2:18 Improving efficiency
5:11 N numbers from biased coin
8:59 Biased coin from fair coin
I'm always looking forward to your videos Emma! Your explanations and examples are always clear and thorough. Thanks for what you do!
I really like the real data science interview series! thank you for the sharing!
Such a great explanation! I have read similar explanations before but did not fully understand them, and here you explained it very comprehensible. Thanks!
Very informative video covering lots of content clearly. Just adding a bit on the rejection sampling: in the case we get the results > 5 (useless sample), we just throw the coin three times again and if the results are
Emma you're the best for explaining things! Thanks for your videos
Thanks Carolina! Glad you like them!
The best explanation to this type of questions!
Yes, the coin problems!
Hi Emma, loved these examples of using previous ideas to answer next questions. Where can i find more of such?
I also wish to clarify some concepts in the "improving efficiency" section. I understand the theory at 3:25 that the sum of prob of 1st case in the 3 pairs = sum of prob of 2nd case in the 3 pairs. However, what exactly does this mean in terms of how we make decisions in the experiment? When you mention this sum=sum concept, it makes me think I have to keep on flipping until I can see all of the 1st (or 2nd) combi in each of 3 pairs, meaning 3x4=12 flips to make a decision of heads/tails.
However, my intuition is that you actually only need 1 out of any of the 6 combis to reach a (head/tail) decision: if you get any combi in the 1st 3, that's heads decided, and if you get any combi in the 2nd 3, that's tails decided. If my guess is true, then I don't see how the 2 concepts of "putting some outcomes into 3 pairs" and "sum 1st element across pairs = sum 2nd element across pairs" are important. I'm thinking there is no need to identify 3 pairs. As long as, out of all the outcomes, I can find at least 1 pair (2 diff combis with the same prob), I can already assign 1 combi in the pair to heads and the other to tails, so I don't see how this concept of identifying 3 pairs helps.
Could you comment whether my thinking is wrong and how the 3 pairs and sum=sum translates to decision making in the experiment?
I see the general idea of this improving efficiency is to reuse inconclusive cases from previous experiments, can we stretch this further to 8 flips to also make HHHH and TTTT useful? Like applying "improving efficiency" onto the "improving efficiency" solution recursively forever to reduce the rejection probability even further from the current 0.81^2 + 0.01^2 = 0.6562?
1. Those questions come from the research and organizing I do when I prepare for interviews. You can take a look at Reddit, Glassdoor, or other forums for interview questions.
2. Your intuition that you only need 1 pair (2 permutations) out of the 6 is correct. We want to use as many pairs as possible to increase the efficiency.
3. Using 8 flips will further improve the efficiency. Intuitively, you are recycling the information in the previous tosses instead of throwing it away, which makes your strategy more efficient.
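To make the pairing idea in this exchange concrete, here is a minimal Python sketch. It assumes the four-flip scheme works as follows: HT or TH on either pair of flips decides the result, HH followed by TT counts as heads, TT followed by HH counts as tails, and only HHHH and TTTT are thrown away. That reading is consistent with the rejection probability quoted in the question above, but it is an assumption about the video, not a transcript of it. The names `biased_flip`, `fair_from_two`, `fair_from_four`, and `heads_fraction` are hypothetical.

```python
import random

def biased_flip(rng, p):
    """One flip of the biased coin: 'H' with probability p, else 'T'."""
    return "H" if rng.random() < p else "T"

def fair_from_two(rng, p):
    """Basic scheme: flip twice; HT -> H, TH -> T, HH/TT -> flip again."""
    while True:
        pair = biased_flip(rng, p) + biased_flip(rng, p)
        if pair in ("HT", "TH"):
            return pair[0]

def fair_from_four(rng, p):
    """Assumed 4-flip scheme: additionally use HH+TT -> H and TT+HH -> T."""
    while True:
        first = biased_flip(rng, p) + biased_flip(rng, p)
        if first in ("HT", "TH"):
            return first[0]
        second = biased_flip(rng, p) + biased_flip(rng, p)
        if second in ("HT", "TH"):
            return second[0]
        if first != second:  # HH then TT, or TT then HH
            return first[0]
        # HHHH or TTTT: nothing usable, so start over.

def heads_fraction(sampler, p, n, seed=0):
    """Fraction of heads among n simulated fair flips built from the biased coin."""
    rng = random.Random(seed)
    return sum(sampler(rng, p) == "H" for _ in range(n)) / n
```

Calling `heads_fraction(fair_from_four, 0.9, 20000)` should land close to 0.5 even though the underlying coin is heavily biased. Each extra pair only adds outcomes that are equally likely on the heads and tails side, which is why using more pairs lowers the rejection rate without breaking fairness.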
@@emma_ding Hi Emma, thanks for sharing! I have the same question in mind. Does that mean we need 12 flips to decide one head or tail event? Thanks!
Sounds very close to talking about entropy :D
Hi Emma, can you suggest where one can prepare for behavioural interviews and stats & probability interview. Also how common do you think non data structure Python questions are?
I have listed materials I used in this post. towardsdatascience.com/how-i-got-4-data-science-offers-and-doubled-my-income-2-months-after-being-laid-off-b3b6d2de6938. For non data structure coding, you can refer to towardsdatascience.com/the-ultimate-guide-to-acing-coding-interviews-for-data-scientists-d45c99d6bddc.
@@emma_ding Thanks for the reply! I saw in one of your other video that you scheduled an interview a month from getting in contact with a recruiter. I'm curious to know if I can schedule a technical phone screen 4 weeks from a recruiter call and an on-site 5-6 weeks from the technical phone screen, or is the timeline too long?
Great video! Thank you! But even if we toss 4 times, we can not guarantee to get 4 different results. How would 2^m be enough?
It's the number of possible outcome sequences. Two tosses give 4 sequences and three tosses give 8. To represent (at least) N numbers, you need ceil(log2(N)) tosses.
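The ceil(log2(N)) rule in this reply can be sketched in Python as below. This is an illustration under assumptions: it takes fair 0/1 flips as given (here from `random.Random`, standing in for a fair coin built from the biased one) and discards any result that falls outside the range, as in the rejection step mentioned in an earlier comment. The names `tosses_needed` and `uniform_below` are hypothetical.

```python
import random

def tosses_needed(n):
    """Smallest m with 2**m >= n, which equals ceil(log2(n)) for n >= 1."""
    return (n - 1).bit_length()

def uniform_below(n, fair_bit):
    """Return an integer in [0, n) using fair 0/1 flips and rejection."""
    m = tosses_needed(n)
    while True:
        value = 0
        for _ in range(m):
            value = (value << 1) | fair_bit()  # append one flip as a binary digit
        if value < n:
            return value
        # value >= n is a useless sample: toss all m coins again.

rng = random.Random(0)                # stand-in for a fair coin (assumption)
fair_bit = lambda: rng.getrandbits(1)
draws = [uniform_below(6, fair_bit) for _ in range(12000)]
```

With N = 6 this uses three tosses per attempt, so the 8 possible sequences cover the values 0 to 7 and the two values above 5 are rejected. The integer `bit_length` form is used instead of `math.log2` only to avoid floating-point rounding.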
@@emma_ding I see. Thank you!
For the biased coin problem with extreme p, for example p=0.9, if the new way (4 coins) is used as you suggested, I got P(throwing away results)=1- (.9^3*.1*2+.9^2*.1^2*2+.1^3*.9*2)=0.8362, which is surprisingly even worse than using 2 coins. I used the other way and get the same result .9*.1*2+.81^2+.01^2=0.8362. Where am I wrong?🙃
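One way to check figures like these is to enumerate all 16 four-flip outcomes exactly. The sketch below does that for p = 0.9, assuming a scheme in which HT or TH on either pair decides the result, HH then TT means heads, TT then HH means tails, and only HHHH and TTTT are rejected. That scheme is an assumption about the video, and `decide` and `outcome_probs` are hypothetical names.

```python
from fractions import Fraction
from itertools import product

def decide(flips):
    """Map 4 flips to 'H', 'T', or None (rejected) under the assumed scheme."""
    first, second = flips[:2], flips[2:]
    if first in ("HT", "TH"):
        return first[0]
    if second in ("HT", "TH"):
        return second[0]
    if first != second:  # HHTT or TTHH
        return first[0]
    return None          # HHHH or TTTT

def outcome_probs(p):
    """Exact probability of deciding H, deciding T, or rejecting."""
    q = 1 - p
    totals = {"H": Fraction(0), "T": Fraction(0), None: Fraction(0)}
    for combo in product("HT", repeat=4):
        flips = "".join(combo)
        totals[decide(flips)] += p ** flips.count("H") * q ** flips.count("T")
    return totals

probs = outcome_probs(Fraction(9, 10))
two_flip_reject = Fraction(9, 10) ** 2 + Fraction(1, 10) ** 2  # HH or TT
```

Under these assumptions `probs[None]` is 0.6562, the same value as the 0.81^2 + 0.01^2 quoted in an earlier comment, while `two_flip_reject` is 0.82. Heads and tails each get probability 0.1719, so the result stays fair.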
When you get HT and TH from the Biased coin, why does HT mean that it is Heads and TH mean that it is Tails?
You jumped from 2 tosses to 4 tosses. Why not consider 3 tosses? Is it because of efficiency?
@qifei zhang I attempted 3 toss (reuse last toss instead of throwing away 2 tosses described in video) efficiency calculation above, do you agree with that reasoning?
We can glue two biased coins together to get a fair coin lol. Well, this is a brain teaser answer.
I will never be able to crack these type of questions...