Hi Nate, my solution is:- select (count(*))/(select count(*) from fb_active_users where country='USA') as monthly_share from fb_active_users where status='open' and country='USA';
Hi Nate, now I'm beginning to practice the steps on solving the problem, not just go directly and dive to the coding. your steps is very logical and can be very advantageous in more complex question. Divide and conquer is the focus to come up with a logical solution. Thanks for continuous sharing your knowledge.
@@stratascratch BTW I'm incline to subscribe to your platform for Premium account to learn in deep on SQL solution for data science and maybe python as well. Does StrataScratch team help out if member is having difficulty solving any of hard problem in there ?
@@PATRICKCHUAD Hey that's great. yes, we do help with any questions you have. There's a discussion board for each question. If you leave a question, we can answer them for you. It takes about 2-3 days for an answer. Check out the free questions to see the video solutions and discussion forum before upgrading. See if you like it!
@@stratascratch Yes I have visited the site already several time and also tried to solve the problem in there. That is why I want to upgrade to premium to know if my solution is correct and also to know other solution possible to solve the problem. I pretty much like the platform. I maybe get the 1 year premium plan since it is a lot cheaper compared to the monthly one.
Hi Nate, awesome video! Your content is really helping me with my upcoming interview with Facebook. I was wondering, would it have been possible to just do a left join with two subqueries instead? Or are you just optimizing for runtime
You could do that but it would be really inefficient and unnecessary. This query is actually already pretty inefficient. I wouldn't even use a subquery to answer this. I would just use the case statements in the numerator and perform a count(*) for the denominator. That is probably the most efficient way to solve it. If you go to the platform, there is a tab called "solution from other users" where you'll find other ways users have solved the question. Some are really clever and concise. Goodluck on FB!
@@mrblahblihblih It's not that important in an interview. Not THE most efficient that is. Your code should make sense and not have a bunch of unnecessary sections/clauses. You'll usually have an opportunity in the end to optimize it so you can tell the interviewer how you'd write more efficient code.
From what I recall, their video screening (1st round) is a SQL test that's on their platform called BlueJeans (or something like that). Basically it's a glorified notepad. You can't execute any code. The in-person rounds are whiteboard. Hope that helps!
I did it this way : with usa_people_count as ( select count(*) as total_people_count from fb_active_users where country = 'USA' ), active_people_count_in_usa as ( select count(*) as active_people_count from fb_active_users where country = 'USA' and status = 'open' ) select cast(active_people_count as decimal) / total_people_count as share from usa_people_count, active_people_count_in_usa;
I wrote my code just before you explained your solution and the only thing I changed is I passed the step with doubling user_id checking, instead I counted them with distinct prefix "count(distinct user_id)", everything else was the same (including nested select).
Hi Nate, I didn't get it why you didn't select the distinct user during your code. as count(case when status='open' then user_id else null) if a user showed twice, it will be 2 for his count
An attendance log for every student in a school district attendance_events : date | student_id | attendance • A summary table with demographics for each student in the district all_ students : student_id | school_id | grade_level | date_of_birth | hometown Using this data, you could answer questions like the following: • What percent of students attend school on their birthday? • Which grade level had the largest drop in attendance between yesterday and today?
Is casting the ratio as a float something that needs to be done only in certain SQL environments? The mysql editor on leetcode does produce a ratio when dividing an integer by another integer. Maybe this is because I used the round function though to go out 2 decimal places?
It depends on the data type of the number. If they are both integers and you're dividing them, then you'll get an integer as the output, regardless of SQL engine being used. But if one of the numbers is a numeric or float then you'll get back a float/numeric. I would be wary of sql editors on platforms like leetcode, hackerrank, and stratascratch because there's a lot of things going on in the background that are outside of normal sql behavior. For example, with stratascratch we pass the sql output to python so that we can run our solution validation algorithm to see if you got the right answer. The conversion from sql to python isn't always clean and isn't always tied to how sql would normally behave...that's why you see our float outputs rounded to the nearest 3rd decimal. Hope this makes sense.
Hey @ Nate, great video’s and thanks so much for the practical scenarios. I have been asked about Mapping in a data analysis interview and was told alot of mapping will be part ofthe job. Any videos on that any time soon ? Thanks
@@stratascratch thank you, Hopefully better luck next time. Mapping data base. I didn’t understand and probably thats why I didn’t pass. The company had Apache Hive. Im guessing it had to do with creating tables not just query from them. Is that a normal task of a data analysts ?
@@MrJohn2brown To me mapping means creating tables where you're mapping one table to another. Basically it's an intermediate table that connects 2 other tables together. Sometimes there's a lot of clean up and logic required to map 2 tables together so a mapping/intermediate table can be used. Or another definition could be creating a table that has data from multiple other tables together. Like if you have a user table but some info comes from engagement tables and other info comes from a payments table. That could be considered a mapping table too. Both require creating databases. Hope that helps.
Here's my solution: select count(case when status = 'open' then user_id else null end)*1.0/count(*)*1.0 as active_user_share from fb_active_users where country = 'USA'
User_id 82: country Australia, User_id 34: Donald Ross country China in the table. But when you filtered out the country, both of those users are shown in the output. Could you please explain it?
Thank you and thanks for watching these videos. I try to make these videos from the POV of the interviewee or at least how I would want an interviewee to answer the question if I was the interviewer.
how to reply if the interviewer ask, " if you are solving the question in intervals, will you be doing the same while on the job?" Just want to think of all possible questions can be asked in the coding interview.
Yea total valid point. In an interview, I would just split up the logic into steps and then write the code in steps. Then pause so that the interviewer can evaluate the code. This way they can catch any problems before you write out the entire solution. This framework has saved me several times on interviews =)
Hi Nate, Does the SQL implementation (Oracle, MySQL, PostgreSQL..) matters when interviewing with FB? Which one would you recommend for FB particularly
It doesn't at all. You basically just get a notepad when interviewing with FB. None of your code will execute so you can write in any SQL flavor you want.
Hi Nate, thanks for the awesome videos. At 7:13, I dont understand how count(NULL) will NOT count that particular row. I was going to use SUM(CASE(WHEN status = open, 0, 1)) but will a COUNT skip a row having a NULL value? thanks
That's basically how the functions are meant to work. count() won't count any NULL rows and sum() will count all rows whether Null or not. In your case the sum() + Case statement is how you should approach it if you want to count only opens. Here's more info discuss.codecademy.com/t/when-do-we-use-count-or-sum/351543
My Solution: select us_active_users/us_total_users::decimal share_active_users from (select count(case when status='open' and country='USA' then user_id else null end) us_active_users, count(case when country='USA' then user_id else null end) us_total_users from fb_active_users)a;
Hi Nate, my solution is:-
select (count(*))/(select count(*) from fb_active_users where country='USA') as monthly_share from fb_active_users
where status='open' and country='USA';
That is just wonderful!
Nate your content is gold! Thanks for your detailed walkthrough of the problem and solution. It is a huge help for data science interviewees like me!
Awesome, thank you!
Hi Nate, now I'm beginning to practice the steps on solving the problem, not just go directly and dive to the coding. your steps is very logical and can be very advantageous in more complex question. Divide and conquer is the focus to come up with a logical solution. Thanks for continuous sharing your knowledge.
Thanks for continuing to watch the videos. I'll keep pumping more and more out!
@@stratascratch BTW I'm incline to subscribe to your platform for Premium account to learn in deep on SQL solution for data science and maybe python as well. Does StrataScratch team help out if member is having difficulty solving any of hard problem in there ?
@@PATRICKCHUAD Hey that's great. yes, we do help with any questions you have. There's a discussion board for each question. If you leave a question, we can answer them for you. It takes about 2-3 days for an answer. Check out the free questions to see the video solutions and discussion forum before upgrading. See if you like it!
@@stratascratch Yes I have visited the site already several time and also tried to solve the problem in there. That is why I want to upgrade to premium to know if my solution is correct and also to know other solution possible to solve the problem. I pretty much like the platform. I maybe get the 1 year premium plan since it is a lot cheaper compared to the monthly one.
Seems no paypal option for payment. Is it correct ?
This was really helpful! Thank you very much. Please do more such videos!
Your content is amazing. I love the way you have explained the solutions.
Really nice video. Thank you so much!
Hi Nate, awesome video! Your content is really helping me with my upcoming interview with Facebook. I was wondering, would it have been possible to just do a left join with two subqueries instead? Or are you just optimizing for runtime
You could do that but it would be really inefficient and unnecessary. This query is actually already pretty inefficient. I wouldn't even use a subquery to answer this. I would just use the case statements in the numerator and perform a count(*) for the denominator. That is probably the most efficient way to solve it. If you go to the platform, there is a tab called "solution from other users" where you'll find other ways users have solved the question. Some are really clever and concise. Goodluck on FB!
@@stratascratch got it thanks! Would you say writing the most efficient query is important in the interview?
@@mrblahblihblih It's not that important in an interview. Not THE most efficient that is. Your code should make sense and not have a bunch of unnecessary sections/clauses. You'll usually have an opportunity in the end to optimize it so you can tell the interviewer how you'd write more efficient code.
Hi nate your vedios boost my confidence man Thanks a Lot
Glad to have helped! Good luck on your interviews if you have any lined up
@@stratascratch I want to do my master's on data engineering any suggestion
hey guys, does the fb data analyst interviews has whiteboard or IDE for sql test?
From what I recall, their video screening (1st round) is a SQL test that's on their platform called BlueJeans (or something like that). Basically it's a glorified notepad. You can't execute any code. The in-person rounds are whiteboard. Hope that helps!
@@stratascratch yes..that is right, thanks alot 😊
I did it this way :
with usa_people_count as (
select count(*) as total_people_count from fb_active_users where country = 'USA'
), active_people_count_in_usa as (
select count(*) as active_people_count from fb_active_users where country = 'USA' and status = 'open'
)
select cast(active_people_count as decimal) / total_people_count as share from usa_people_count, active_people_count_in_usa;
I wrote my code just before you explained your solution and the only thing I changed is I passed the step with doubling user_id checking, instead I counted them with distinct prefix "count(distinct user_id)", everything else was the same (including nested select).
That looks like it works! Great job and thanks for following along with the video.
Hi Nate, I didn't get it why you didn't select the distinct user during your code. as count(case when status='open' then user_id else null)
if a user showed twice, it will be 2 for his count
That's a great point! I should have caught that edge case but I didn't. We should definitely add a distinct so we don't double count.
Hey Nat, by looking the question, i am not able to find where question is asking about the ratio..???
Great video!
Thank you for watching!
An attendance log for every student in a school district attendance_events :
date | student_id | attendance
• A summary table with demographics for each student in the district all_
students : student_id | school_id | grade_level | date_of_birth | hometown
Using this data, you could answer questions like the following:
• What percent of students attend school on their birthday?
• Which grade level had the largest drop in attendance between yesterday
and today?
Is casting the ratio as a float something that needs to be done only in certain SQL environments? The mysql editor on leetcode does produce a ratio when dividing an integer by another integer. Maybe this is because I used the round function though to go out 2 decimal places?
It depends on the data type of the number. If they are both integers and you're dividing them, then you'll get an integer as the output, regardless of SQL engine being used. But if one of the numbers is a numeric or float then you'll get back a float/numeric. I would be wary of sql editors on platforms like leetcode, hackerrank, and stratascratch because there's a lot of things going on in the background that are outside of normal sql behavior. For example, with stratascratch we pass the sql output to python so that we can run our solution validation algorithm to see if you got the right answer. The conversion from sql to python isn't always clean and isn't always tied to how sql would normally behave...that's why you see our float outputs rounded to the nearest 3rd decimal. Hope this makes sense.
@@stratascratch Awesome answer, thanks!
Hey @ Nate, great video’s and thanks so much for the practical scenarios. I have been asked about Mapping in a data analysis interview and was told alot of mapping will be part ofthe job. Any videos on that any time soon ? Thanks
Hi Tahir. Great job on your analysis interview so far. Can you elaborate on mapping? What do you mean exactly?
@@stratascratch thank you, Hopefully better luck next time. Mapping data base. I didn’t understand and probably thats why I didn’t pass. The company had Apache Hive. Im guessing it had to do with creating tables not just query from them. Is that a normal task of a data analysts ?
@@MrJohn2brown To me mapping means creating tables where you're mapping one table to another. Basically it's an intermediate table that connects 2 other tables together. Sometimes there's a lot of clean up and logic required to map 2 tables together so a mapping/intermediate table can be used. Or another definition could be creating a table that has data from multiple other tables together. Like if you have a user table but some info comes from engagement tables and other info comes from a payments table. That could be considered a mapping table too. Both require creating databases. Hope that helps.
@@stratascratch I see, so its like joining on temp tables. Thanks Nate
@@MrJohn2brown Similar yes. That's my guess without actually seeing the question or exercise you were required to do =)
Here's my solution:
select count(case when status = 'open' then user_id else null end)*1.0/count(*)*1.0 as active_user_share
from fb_active_users
where country = 'USA'
User_id 82: country Australia, User_id 34: Donald Ross country China in the table. But when you filtered out the country, both of those users are shown in the output. Could you please explain it?
What's the code you used? Not sure why you're getting that
Solid thanks man
Hi Nate i just wanna say thanks for the videos! Your data science interview question with solutions are really helpful and engaging:)
Thank you and thanks for watching these videos. I try to make these videos from the POV of the interviewee or at least how I would want an interviewee to answer the question if I was the interviewer.
how to reply if the interviewer ask, " if you are solving the question in intervals, will you be doing the same while on the job?"
Just want to think of all possible questions can be asked in the coding interview.
Thanks a lot for that!
I am afraid if in real interviews we can test parts of logic or carry out code increment? What do we do in that case?
Yea total valid point. In an interview, I would just split up the logic into steps and then write the code in steps. Then pause so that the interviewer can evaluate the code. This way they can catch any problems before you write out the entire solution. This framework has saved me several times on interviews =)
Which portal you use for practise?
I'm using www.stratascratch.com
How can I apply for a Data Base on facebook?.
Hi Nate,
Does the SQL implementation (Oracle, MySQL, PostgreSQL..) matters when interviewing with FB?
Which one would you recommend for FB particularly
It doesn't at all. You basically just get a notepad when interviewing with FB. None of your code will execute so you can write in any SQL flavor you want.
@@stratascratch
Thank you Nate, this is very helpful
Hi Nate, thanks for the awesome videos. At 7:13, I dont understand how count(NULL) will NOT count that particular row. I was going to use SUM(CASE(WHEN status = open, 0, 1)) but will a COUNT skip a row having a NULL value? thanks
That's basically how the functions are meant to work. count() won't count any NULL rows and sum() will count all rows whether Null or not. In your case the sum() + Case statement is how you should approach it if you want to count only opens. Here's more info discuss.codecademy.com/t/when-do-we-use-count-or-sum/351543
you are not allowed to explore the data
yeah exactly. during the interview you are just given a preview, and a static CoderPad environment with no way to run and test output
My Solution:
select us_active_users/us_total_users::decimal share_active_users
from
(select
count(case
when status='open' and country='USA'
then user_id else null end) us_active_users,
count(case
when country='USA'
then user_id else null end) us_total_users
from fb_active_users)a;