22: Recommendation Engine (YouTube, TikTok) | Systems Design Interview Questions With Ex-Google SWE
- Published on Jun 24, 2024
- To try everything Brilliant has to offer-free-for a full 30 days, visit brilliant.org/Jordanhasnolife/ . You’ll also get 20% off an annual premium subscription.
Remember, girls love ML
This video was sponsored by Brilliant - Science and Technology
Hey Jordan, I recently joined Google as an SSE and I wanted to express my sincere gratitude for your system design videos, especially the ones comparing multiple solutions. Those comparisons were exactly what the interviewers were looking for in my feedback.
Legend!! Congrats man and enjoy the new role!
Congrats on the sponsor bro! Keep up the good work
Incredible ad read 😂
Just found your channel a few days ago, and I'm watching most of your previous videos. Liked the one on Message Brokers.
Thank you for these videos!
TIL what embedding is! Congrats on the sponsor BTW!!!
Hey Jordan, great content. Thank you for making these videos so in-depth. Quick question: do you think we could use a graph database such as neo4j instead of the neighbor index for faster reads?
I think that you could, but consider this - for every vector (which is an arbitrary set of points), you'd need to create an edge to other vectors, so that you can traverse the graph. How do you decide which ones to do that for? Even then, let's imagine you could - you'd still have to run a breadth first search to find the closest vectors. I'd think that pre-caching your answers here will just about always be the fastest option.
Can you please share your slides as well? It would be really helpful.
Planning on doing this in bulk after finishing my current series, this will be in the next 1-3 months.
Great video! I see that you used a heap for new entries into the closest neighbor index. Isn't insertion into a heap the same O(log n) as it would be in a DB index that uses B+ trees? I understand that in the index we might need to replace multiple rows, whereas with a heap that won't happen. Is that the optimization here? Trying to understand how this speeds things up.
The optimization is that this is in memory
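For readers following the heap discussion above, here is a minimal sketch of a bounded in-memory heap that keeps only the k closest neighbors of one entity. The class name, method names, and the use of a plain distance number are assumptions for illustration, not the video's actual implementation.

```python
import heapq


class NearestNeighbors:
    """Keeps the k closest neighbors of one entity in an in-memory heap."""

    def __init__(self, k):
        self.k = k
        # Entries are (-distance, neighbor_id). heapq is a min-heap, so the
        # root is always the farthest neighbor we are currently keeping.
        self._heap = []

    def offer(self, neighbor_id, distance):
        """Consider a new candidate; keep it only if it is among the k closest."""
        entry = (-distance, neighbor_id)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)      # O(log k)
        elif entry > self._heap[0]:
            heapq.heapreplace(self._heap, entry)   # evict the farthest, O(log k)

    def closest(self):
        """Return neighbor IDs ordered from closest to farthest."""
        return [nid for _, nid in sorted(self._heap, reverse=True)]
```

Each update touches only this small in-memory structure, which is the point made in the reply: the asymptotic cost is similar to an index insert, but no disk-backed rows are rewritten.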
Great video Jordan! Learned a lot from this video. One question on the Recommendation Service -> Neighbor index flow at 41:18. Since we are sharding the neighbor index by entity_id, the recommendation service, in case of a cache miss, has to scatter and gather, right? Entities 12, 13, and 62 (the examples in the slide) could be in different partitions.
They would have to fetch the neighbors for their last x watched videos. So for each of those x videos, all of its neighbors will be on the same node, but otherwise we may have to hit up to x different partitions.
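A rough sketch of the read planning described in this reply: group the last x watched entity IDs by partition so each partition is queried once. The modulo partitioning function and the function names are assumptions for illustration; the video does not specify how entity_id maps to a partition.

```python
from collections import defaultdict


def partition_for(entity_id, num_partitions):
    # Assumption: simple modulo sharding on a numeric entity_id.
    return entity_id % num_partitions


def plan_neighbor_reads(watched_ids, num_partitions):
    """Group watched entity IDs by the partition holding their neighbor lists."""
    plan = defaultdict(list)
    for entity_id in watched_ids:
        plan[partition_for(entity_id, num_partitions)].append(entity_id)
    return dict(plan)


# With the slide's example IDs and 4 hypothetical partitions,
# each ID lands on a different partition, so there are 3 requests.
print(plan_neighbor_reads([12, 13, 62], 4))
```

In the worst case the plan has x entries (one partition per watched video), and in the best case all x IDs share a partition and a single request suffices.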
Great video Jordan! Can you do one for an ACID-based system like a digital wallet or a bank? Or maybe a combination of both, like bank-to-wallet and wallet-to-wallet transfers?
Where do you see the challenge here? At least to me, this initially just feels like you'll need ACID databases, or two phase commit when making a transaction between two accounts on different partitions.
Techlead catching well deserved strays
Hey Jordan. What books would you recommend I read? I have already finished DDIA.
Hey! I'd probably start reading some white papers! As for which ones, there are like 10-20 tools on LinkedIn who only post links of other people's content on their pages, hopefully one of them is decent
Hi Jordan, can you please share your iPad notes? Maybe they are not perfect, but they would serve as some sort of reference for revision.
Planning on doing this in bulk after finishing my current series, this will be in the next 1-3 months.
@@jordanhasnolife5163 Hi, don't mean to rush you but I have some important interviews in coming weeks and having your notes will really help me prep better. Can you share them in any form. I understand there can be mistakes or typos in them but I want to be able to quickly revise all the overarching concepts and designs
@@vipulspartacus7771 Hi Vipul - understand your rush here, it will take me a few hours to properly export everything, which is the reason for the delay. I haven't sat down and done it. Additionally, once I do, I'd like to publicize that a bit, as I hope that they can help me build my following if we're being fully transparent here. My original slides contain all of the same information.
@@jordanhasnolife5163 Sure Jordan, I understand, I look forward to it. Once again, really appreciate the content
Hey Jordan, why an 'In-memory-broker' instead of a broker like Kafka?
Hey! For the sake of this video, it probably doesn't have to be, but I'd say check out my video on how to design youtube
Hey Jordan! Are you looking to adopt by any chance? Jk Love your content ❤
Are you a full-time YouTuber now?
Nope, still working for better or for worse lol
Happy to adopt - you any good at cooking?
Does this design account for the popularity/trendiness of a given entity? For example, if a random video from an unknown creator suddenly becomes extremely popular (happens a lot on TikTok), it should be recommended, whereas an hour earlier it was unpopular and irrelevant and thus should not have been recommended.
It does not, and good point! I think for something like this you'd want to see the Top K video, and basically keep a cache of which videos are "trending" in the last x hours to apply a score boost.
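A minimal sketch of the score boost idea in this reply, assuming the set of trending video IDs has already been computed elsewhere (for example by a Top K pipeline) and cached. The function name, the multiplicative boost, and the 1.5 default are all assumptions for illustration.

```python
def apply_trending_boost(scored, trending_ids, boost=1.5):
    """Multiply the score of trending videos, then re-rank.

    scored: list of (video_id, score) pairs from the recommender.
    trending_ids: set of video IDs trending in the last x hours (assumed cached).
    """
    boosted = [
        (video_id, score * boost if video_id in trending_ids else score)
        for video_id, score in scored
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


ranked = apply_trending_boost([("a", 0.8), ("b", 0.6)], trending_ids={"b"})
print([video_id for video_id, _ in ranked])
```

Here video "b" overtakes "a" once the boost is applied, which is the behavior the question asks for when a video suddenly becomes popular.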
Steps 2b and 2a should be switched.
Sure, it doesn't really change the time complexity either way, but I can see the benefit if scoring is expensive.
@@jordanhasnolife5163 Scoring is usually O(n!*(e^n)*log(n)), where n is the number of items to be scored. We need to filter in advance.
@@jordanhasnolife5163 Scoring is usually O((N!)(e^N)log(N)). We need to save where we can.
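To illustrate the ordering point in this exchange, here is a small sketch that filters candidates before scoring them, so the (possibly expensive) scoring function runs only on survivors. The function name and the `keep` and `score` callables are hypothetical stand-ins; the thread does not say what steps 2a and 2b actually filter or score on.

```python
def filter_then_score(candidates, keep, score, limit):
    """Drop unwanted candidates first, then score and rank only the survivors."""
    survivors = [c for c in candidates if keep(c)]    # cheap filter runs on everything
    scored = [(score(c), c) for c in survivors]       # expensive scoring runs on fewer items
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:limit]]


score_calls = []

def toy_score(candidate):
    # Stand-in for an expensive model call; records how often it is invoked.
    score_calls.append(candidate)
    return len(candidate)

top = filter_then_score(["aaa", "b", "cc", "dddd"], lambda c: c != "dddd", toy_score, 2)
print(top, len(score_calls))
```

The filtered-out candidate is never scored, which is where the savings come from when scoring dominates the cost.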
agree tech lead is a sham that shafted his followers (as a millionaire)
I might do it too (as a non millionaire)
@@jordanhasnolife5163 yes jordan responded to me! it would be an honor to be your victim, sempai