Data Science Interview Mastery: How to Solve Sampling and Simulation Problems with Ease!

Emma Ding

16,009 views

Published on Jul 24, 2024
Simulate Biased Coin and Fair Coin | Coin problems in Data Science Interviews | Data science Interview questions | Statistics Interview questions
In this video we will go over sampling problems using biased and fair coins. These questions are very commonly asked in Data Science interviews.
🟢Get all my free data science interview resources
www.emmading.com/resources
🟡 Product Case Interview Cheatsheet www.emmading.com/product-case...
🟠 Statistics Interview Cheatsheet www.emmading.com/statistics-i...
🟣 Behavioral Interview Cheatsheet www.emmading.com/behavioral-i...
🔵 Data Science Resume Checklist www.emmading.com/data-science...
✅ We work with Experienced Data Scientists to help them land their next dream jobs. Apply now: www.emmading.com/coaching
// Comment
Got any questions? Something to add?
Write a comment below to chat.
// Let's connect on LinkedIn:
/ emmading001
====================
Contents of this video:
====================
0:00 Intro
0:24 Fair coin from biased coin
2:18 Improving efficiency
5:11 N numbers from biased coin
8:59 Biased coin from fair coin

Comments • 27

@insigh01 3 years ago ⁺²
I'm always looking forward to your videos Emma! Your explanations and examples are always clear and thorough. Thanks for what you do!
@Ericq1118 3 years ago ⁺³
I really like the real data science interview series! Thank you for sharing!
@pro100olga 3 years ago
Such a great explanation! I have read similar explanations before but did not fully understand them, and here you explained it very comprehensibly. Thanks!
@gengtony1204 2 years ago ⁺¹
Very informative video covering lots of content clearly. Just adding a bit on the rejection sampling: in the case we get the results > 5 (useless sample), we just throw the coin three times again and if the results are
@Carolina-fn1zw 3 years ago
Emma you're the best for explaining things! Thanks for your videos
@emma_ding 3 years ago ⁺¹
Thanks Carolina! Glad you like them!
@crossvalidation1040 2 years ago
The best explanation of this type of question!
@michaelq4261 2 years ago
Yes, the coin problems!
@Han-ve8uh 3 years ago ⁺¹
Hi Emma, loved these examples of using previous ideas to answer the next questions. Where can I find more of these?
I also wish to clarify some concepts in the "improving efficiency" section. I understand the theory at 3:25 that the sum of the probabilities of the 1st case in the 3 pairs equals the sum of the probabilities of the 2nd case in the 3 pairs. However, what exactly does this mean in terms of how we make decisions in the experiment? When you mention this sum = sum concept, it makes me think I have to keep flipping until I have seen all of the 1st (or 2nd) combinations in each of the 3 pairs, meaning 3x4=12 flips to decide heads or tails.
However, my intuition is that you only need 1 of the 6 combinations to reach a heads/tails decision: if you get any combination in the first 3, that's heads, and if you get any combination in the second 3, that's tails. If my guess is true, then I don't see why the 2 concepts of "putting some outcomes into 3 pairs" and "sum of 1st elements across pairs = sum of 2nd elements across pairs" are important. I'm thinking there is no need to identify 3 pairs. As long as I can find at least 1 pair among all the outcomes (2 different combinations with the same probability), I can already assign one combination in the pair to heads and the other to tails, so I don't see how identifying 3 pairs helps.
Could you comment on whether my thinking is wrong and how the 3 pairs and sum = sum translate to decision making in the experiment?
I see that the general idea of improving efficiency is to reuse inconclusive cases from previous experiments. Can we stretch this further to 8 flips to also make HHHH and TTTT useful? That is, apply "improving efficiency" to the "improving efficiency" solution recursively to reduce the rejection probability even further from the current 0.81^2 + 0.01^2 = 0.6562?
@emma_ding 3 years ago ⁺²
1. Those questions come from researching and organizing questions while I prepared for interviews. You can look at Reddit, Glassdoor, or other forums for interview questions.
2. Your intuition that you only need 1 pair (2 permutations) out of the 6 is correct. We want to use as many pairs as possible to increase efficiency.
3. Using 8 flips will further improve efficiency. Intuitively, you are recycling the information in the previous tosses instead of throwing it away, which makes your strategy more efficient.
@sitongchen6688 1 year ago
@@emma_ding Hi Emma, thanks for sharing! I have the same question in mind. Does that mean we will need 12 flips to decide one heads or tails event? Thanks!
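A minimal sketch of the recycling idea discussed in this thread, written under an assumed decision rule that the video may or may not use: flip two coins and decide on HT or TH; otherwise flip two more and decide on HT or TH, or on HH followed by TT (heads) and TT followed by HH (tails); start over only on HHHH or TTTT. The names `flip` and `fair_flip_recycling` are hypothetical. Under this rule each attempt uses at most 4 flips, not 12.

```python
import random

def flip(p, rng):
    """One flip of the biased coin: 'H' with probability p, else 'T'."""
    return "H" if rng.random() < p else "T"

def fair_flip_recycling(p, rng):
    """Fair flip that reuses HH/TT pairs instead of discarding them.

    Assumed rule (not confirmed by the video). Requires 0 < p < 1.
    """
    while True:
        first = flip(p, rng) + flip(p, rng)
        if first in ("HT", "TH"):
            return first[0]          # HT -> 'H', TH -> 'T'
        second = flip(p, rng) + flip(p, rng)
        if second in ("HT", "TH"):
            return second[0]
        if first != second:
            return first[0]          # HH then TT -> 'H', TT then HH -> 'T'
        # HHHH or TTTT carries no usable asymmetry: start over.

rng = random.Random(0)
samples = [fair_flip_recycling(0.9, rng) for _ in range(20000)]
heads_share = samples.count("H") / len(samples)
```

Each heads outcome has a mirror-image tails outcome with the same probability (for example HH then TT against TT then HH, both p^2 * (1-p)^2), which is why one matched pair is enough for fairness and additional pairs only reduce the number of discarded flips.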
@LouisChiaki 3 years ago ⁺²
Sounds very close to talking about entropy :D
@SuperLOLABC 3 years ago ⁺¹
Hi Emma, can you suggest where one can prepare for behavioral interviews and stats and probability interviews? Also, how common do you think non-data-structure Python questions are?
@emma_ding 3 years ago ⁺³
I have listed materials I used in this post. towardsdatascience.com/how-i-got-4-data-science-offers-and-doubled-my-income-2-months-after-being-laid-off-b3b6d2de6938. For non data structure coding, you can refer to towardsdatascience.com/the-ultimate-guide-to-acing-coding-interviews-for-data-scientists-d45c99d6bddc.
@SuperLOLABC 3 years ago
@@emma_ding Thanks for the reply! I saw in one of your other videos that you scheduled an interview a month after getting in contact with a recruiter. I'm curious to know if I can schedule a technical phone screen 4 weeks after a recruiter call and an on-site 5-6 weeks after the technical phone screen, or is that timeline too long?
@jiayiwu4101 3 years ago
Great video! Thank you! But even if we toss 4 times, we cannot guarantee that we get 4 different results. How would 2^m be enough?
@emma_ding 3 years ago
It's the number of permutations. For 2 tosses you get 4 permutations and for 3 tosses you get 8. To represent (at least) N numbers, you need ceil(log2(N)) tosses.
@jiayiwu4101 3 years ago
@@emma_ding I see. Thank you!
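The reply above about needing ceil(log2(N)) tosses can be sketched as follows. This is an illustration under assumptions, not the video's code: fair bits come from the HT/TH trick on a biased coin, m fair bits are read as a binary number, and values outside the target range are rejected and redrawn. The names `fair_bit` and `uniform_int` are hypothetical.

```python
import random

def fair_bit(p, rng):
    """Fair bit from a biased coin: flip twice, keep only HT (1) or TH (0)."""
    while True:
        a = rng.random() < p
        b = rng.random() < p
        if a != b:
            return 1 if a else 0

def uniform_int(n, p, rng):
    """Uniform integer in [0, n) built from fair bits with rejection.

    m equals ceil(log2(n)) for n >= 2, so the 2**m bit patterns cover
    all n values; patterns >= n are thrown away and redrawn.
    """
    m = (n - 1).bit_length()
    while True:
        value = 0
        for _ in range(m):
            value = (value << 1) | fair_bit(p, rng)
        if value < n:
            return value

rng = random.Random(1)
draws = [uniform_int(5, 0.7, rng) for _ in range(10000)]
counts = [draws.count(k) for k in range(5)]
```

The point of 2^m is that each of the 2^m bit patterns is equally likely, not that m tosses produce distinct results; every draw maps one pattern to one number.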
@wuliwala3761 2 years ago
For the biased coin problem with an extreme p, for example p=0.9, if I use the new 4-coin method you suggested, I get P(throwing away results) = 1 - (.9^3*.1*2 + .9^2*.1^2*2 + .1^3*.9*2) = 0.8362, which is surprisingly even worse than using 2 coins. I tried the other way and got the same result: .9*.1*2 + .81^2 + .01^2 = 0.8362. Where am I wrong?🙃
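As a hedged check of the arithmetic in this comment, the sketch below enumerates all 16 outcomes of 4 flips under one assumed decision rule: decide on an HT or TH pair, or on HH followed by TT or TT followed by HH, and reject only HHHH and TTTT. This rule is consistent with the 0.81^2 + 0.01^2 figure quoted elsewhere in the comments but is not confirmed by the video, and both function names are hypothetical.

```python
from itertools import product

def reject_prob_four_flips(p):
    """Exact probability that a block of 4 flips yields no decision.

    Assumed rule (not confirmed by the video): split the flips into two
    pairs; decide if either pair is HT/TH, or if the pairs are HH+TT or
    TT+HH. Only HHHH and TTTT are rejected.
    """
    q = 1 - p
    total = 0.0
    for outcome in product("HT", repeat=4):
        first, second = outcome[:2], outcome[2:]
        mixed_pair = first[0] != first[1] or second[0] != second[1]
        decided = mixed_pair or first != second
        if not decided:
            heads = outcome.count("H")
            total += p ** heads * q ** (4 - heads)
    return total

def reject_prob_two_rounds(p):
    """Probability that two independent 2-flip rounds both fail (HH or TT)."""
    q = 1 - p
    return (p * p + q * q) ** 2

four_flip = reject_prob_four_flips(0.9)   # approximately 0.6562
two_round = reject_prob_two_rounds(0.9)   # approximately 0.6724
```

Under that assumption, 14 of the 16 outcomes lead to a decision, while the sum in the comment counts only 6 of them (2 outcomes from each of the 3-heads, 2-heads, and 1-head groups), which would account for the 0.8362 figure.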
@firesongs 2 years ago
When you get HT and TH from the Biased coin, why does HT mean that it is Heads and TH mean that it is Tails?
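On the question above: the labels are a convention. What matters is that P(HT) = p(1-p) = P(TH), so the two outcomes are equally likely for any bias p, and either one can be called heads as long as the other is called tails. A minimal sketch, with hypothetical function names and assuming 0 < p < 1:

```python
import random

def biased_flip(p, rng):
    """One flip of the biased coin: 'H' with probability p, else 'T'."""
    return "H" if rng.random() < p else "T"

def fair_flip(p, rng):
    """Flip twice; keep HT or TH, and flip again on HH or TT."""
    while True:
        a, b = biased_flip(p, rng), biased_flip(p, rng)
        if a != b:
            # HT -> 'H' and TH -> 'T'. Swapping the labels is equally valid
            # because both outcomes have probability p * (1 - p).
            return a

rng = random.Random(0)
flips = [fair_flip(0.9, rng) for _ in range(20000)]
heads_share = flips.count("H") / len(flips)
```

Even with p = 0.9, `heads_share` should land close to 0.5; the cost of a strong bias is more discarded HH pairs, not a biased result.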
@qifeizhang4834 3 years ago
You jumped from 2 tosses to 4 tosses. Why not consider 3 tosses? Is it because of efficiency?
@Han-ve8uh 3 years ago
@qifei zhang I attempted a 3-toss efficiency calculation above (reusing the last toss instead of throwing away 2 tosses as described in the video). Do you agree with that reasoning?
@iriswang8401 3 years ago ⁺⁵
We can glue two biased coins together to get a fair coin lol. Well, this is a brain teaser answer.
@anandvyavahare2031 2 years ago ⁺¹
I will never be able to crack these types of questions...
