Recommendation Engine Design Deep Dive with Google SWE! | Systems Design Interview Question 20

  • Published on 28 Nov 2024

Comments • 42

  • @surendralamichhane8429
    @surendralamichhane8429 2 years ago +5

    You are a true gem; I've already been following a lot of your content on YouTube. You articulate the system design cases and solutions so well, with in-depth explanations of the technologies used.

  • @Lens_lores
    @Lens_lores 2 months ago +1

    I have been asked about recommendation systems in system design interviews because I worked on a team that builds them. Really helpful.

  • @clara.hwajoon
    @clara.hwajoon 10 months ago +2

    Hi! What exactly are the ranking models taking in and putting out? I thought that, after retrieval, the nodes containing the lists of candidates would have a score associated with each candidate based on how close the neighbors are, and that filtering then happens with Bloom filters. Thank you!

    • @jordanhasnolife5163
      @jordanhasnolife5163 9 months ago

      Hey Clara!
      So in the retrieval phase, we hit our embedding caches to find all of the possible suggestions that we may give to a user. Then we filter, to ensure that we don't show the user any content that they've already seen or that shouldn't be shown to them for whatever reason.
      In the ranking phase, we have room to rank a more limited search space of potential items (because we limited them before) according to user-specific criteria. If we retrieved items that already had a score, it would mean that we would either need a copy of the embeddings per user (probably impossible) or that we couldn't account for any user-specific information when making our rankings (not ideal).
      Hope this makes sense!
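
A minimal sketch of the retrieve, filter, rank flow described in this reply, assuming an in-memory `neighbor_index` dict, a plain `set` standing in for the seen-items filter, and a caller-supplied `score_for_user` function. All of these names are hypothetical and not from the video; a real ranking model would replace the scoring callable.

```python
from typing import Callable, Dict, Iterable, List, Set


def recommend(
    recent_item_ids: Iterable[str],
    neighbor_index: Dict[str, List[str]],
    seen_item_ids: Set[str],
    score_for_user: Callable[[str], float],
    top_k: int = 10,
) -> List[str]:
    """Retrieve candidates, filter them, then rank the survivors per user."""
    # Retrieval: union the precomputed neighbors of recently watched items.
    # Candidates carry no score yet, so the index can be shared by all users.
    candidates: List[str] = []
    already_added: Set[str] = set()
    for item_id in recent_item_ids:
        for neighbor_id in neighbor_index.get(item_id, []):
            if neighbor_id not in already_added:
                already_added.add(neighbor_id)
                candidates.append(neighbor_id)

    # Filtering: drop anything this user has already seen.
    # (A plain set is an assumption here, standing in for the real filter.)
    filtered = [c for c in candidates if c not in seen_item_ids]

    # Ranking: user-specific scoring happens only on the reduced candidate set.
    ranked = sorted(filtered, key=score_for_user, reverse=True)
    return ranked[:top_k]
```

The point of the ordering is that the expensive, user-specific scoring only runs on the small set that survives retrieval and filtering.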

    • @clara.hwajoon
      @clara.hwajoon 9 months ago

      @jordanhasnolife5163 Thanks for the answer and for the help! Love your videos.
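
For the Bloom-filter filtering step raised at the top of this thread, a rough sketch of the data structure might look like the following. The class name, the sizes, and the SHA-256-based hashing are illustrative assumptions, not details from the video.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

A false positive here only means an unseen video is occasionally skipped, which is usually an acceptable trade for the memory savings.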

  • @tyson96
    @tyson96 2 years ago +3

    This video randomly popped up in my recommendations (pun intended) and it was a very good recommendation. Really like the detailed explanation. Follower += 1

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 years ago +1

      Thanks Rahul! Funny to see the system in action haha

  • @alvinryder0718
    @alvinryder0718 10 months ago +2

    Have you thought about how the recommendation system would work if we had additional data on users, such as age, location, handset, time-based recommendations, and their activity, as well as different types of tags such as content, category, genre, actors, director, content creator, etc.?

    • @jordanhasnolife5163
      @jordanhasnolife5163 10 months ago +1

      How would you say this would deviate from our traditional recommendation system pipeline? At least to me, these just seem like more features with which to make decisions!

  • @harshbhatt_
    @harshbhatt_ 8 months ago +1

    If the embedding model is trained offline, how will it recommend the most up-to-date data (i.e., a new video created 10 minutes ago)? In that case, I'm a bit confused about how it's real-time. Also, do we not need to use Spark Streaming anywhere in this design? Thanks!

    • @jordanhasnolife5163
      @jordanhasnolife5163 7 months ago

      It probably won't. That being said, you can combine those suggestions with things like subscriptions (see Twitter design) to deliver videos from creators you subscribe to into your news feed.

  • @connorunderwood2743
    @connorunderwood2743 several months ago +1

    Why would the recommendation cache not grab data from our items DB? Isn't the data in Hadoop file storage not yet preprocessed by our Spark job, so it isn't valuable for recommendations?

    • @jordanhasnolife5163
      @jordanhasnolife5163 several months ago

      Well I would imagine we'll have to search the items DB when we look at our most recently interacted-with items, get their neighbors, and then fetch the items corresponding to those IDs, unless the items themselves are denormalized and stored in the neighbor index.

  • @harshbhatt_
    @harshbhatt_ 8 months ago +1

    One more question: does the embedding model + index only look at the item embeddings (videos recently watched), or does it take user information as well (e.g., I set my preferences as liking action movies)?
    Also, how would it work for a cold-start user (a user who does not have any "last 5 videos watched" from which to get 1000 nearest neighbors)?

    • @jordanhasnolife5163
      @jordanhasnolife5163 7 months ago

      The embedding model doesn't have any user info, or else we'd have to duplicate all the embeddings. For the first open, we can just show popular videos or something.
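
The cold-start fallback mentioned here could be sketched roughly as below, assuming a precomputed `popular_item_ids` list and an in-memory `neighbor_index` dict. Both names and the `limit` default are hypothetical, chosen only to mirror the "1000 nearest neighbors" in the question.

```python
from typing import Dict, List, Sequence


def retrieve_candidates(
    recent_item_ids: Sequence[str],
    neighbor_index: Dict[str, List[str]],
    popular_item_ids: Sequence[str],
    limit: int = 1000,
) -> List[str]:
    """Return neighbor-based candidates, or popular items for a cold start."""
    candidates: List[str] = []
    for item_id in recent_item_ids:
        for neighbor_id in neighbor_index.get(item_id, []):
            if neighbor_id not in candidates:
                candidates.append(neighbor_id)

    if not candidates:
        # Cold start: no watch history (or no indexed neighbors for it),
        # so fall back to a user-independent list of popular items.
        return list(popular_item_ids[:limit])
    return candidates[:limit]
```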

  • @ayanSaha13291
    @ayanSaha13291 4 months ago +1

    Great video Jordan! Nicely explained.

  • @aryamansinha2932
    @aryamansinha2932 7 months ago +1

    Hello. All model training seems to be offline with the 2nd approach, so how is it real time?

    • @jordanhasnolife5163
      @jordanhasnolife5163 7 months ago

      The real time part is just asking for more videos. That being said, I suppose you could use stream processing to create embeddings of newly created videos on the fly and then include those as candidates for new suggestions.
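
One hedged way to picture the stream-processing idea: a consumer reads "new video" events and embeds each one as it arrives. The event shape, the `embed` callable, and the `embedding_store` dict below are hypothetical stand-ins for a message broker consumer, the trained embedding model, and the embedding cache.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Vector = List[float]


def index_new_items(
    events: Iterable[Tuple[str, str]],
    embed: Callable[[str], Vector],
    embedding_store: Dict[str, Vector],
) -> List[str]:
    """Embed each newly created item from a stream of (item_id, description)."""
    added: List[str] = []
    for item_id, description in events:
        if item_id in embedding_store:
            # Already embedded (e.g. by the offline batch job); skip it.
            continue
        # The model itself is still trained offline; only inference runs here.
        embedding_store[item_id] = embed(description)
        added.append(item_id)
    return added
```

New embeddings would still need to be linked into whatever neighbor lookup the retrieval step uses before they can show up as candidates.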

  • @LeoGoldenLabrador
    @LeoGoldenLabrador 2 years ago +1

    Can you make a video that talks more about Kafka? Why should we use it, and when, in SD?

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 years ago +1

      I recommend watching my stream processing video - it's a log-based message broker.

  • @सायत्की
    @सायत्की 2 years ago +1

    Hey, just a silly question: are you the same guy as in your DP? If yes, you are a gem and a meme legend too.

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 years ago

      No I'm not haha, I just also enjoy the meme - it may be time for me to change it tbh

  • @anahata.kaalki
    @anahata.kaalki 4 months ago +3

    I saw the comments and thought it was a good video, but it was a huge waste of time. Maybe interesting for people who have zero idea about tech.

  • @AidamanTV2
    @AidamanTV2 2 years ago +2

    You're the Yin to my Neetcode :)

  • @Prem-xe8hk
    @Prem-xe8hk 2 years ago +1

    Can you please elaborate more on the neighbouring index, and on how the vector model is linked to this index? Anyway, great content (y)

    • @jordanhasnolife5163
      @jordanhasnolife5163 2 years ago +2

      It's a map from (vector) to [closest neighboring vectors]. Feel free to elaborate on your question and I'll do my best to help!
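
A small sketch of such a map, built by brute force over Euclidean distance. Keying the index by item ID rather than by the raw vector, the function name, and the exact (non-approximate) search are all assumptions made for illustration; a large catalog would more likely use an approximate nearest-neighbor method.

```python
import math
from typing import Dict, List


def build_neighbor_index(
    embeddings: Dict[str, List[float]], k: int
) -> Dict[str, List[str]]:
    """Map each item ID to the IDs of its k closest items by Euclidean distance."""
    index: Dict[str, List[str]] = {}
    for item_id, vector in embeddings.items():
        # Distance from this item to every other item (O(n^2) overall).
        distances = [
            (math.dist(vector, other_vector), other_id)
            for other_id, other_vector in embeddings.items()
            if other_id != item_id
        ]
        distances.sort()
        index[item_id] = [other_id for _, other_id in distances[:k]]
    return index
```

The embedding model produces the vectors, and this index is computed from them ahead of time, so a lookup at request time is a single dictionary read.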

  • @Gerald-iz7mv
    @Gerald-iz7mv 1 year ago

    What are item change events?

    • @jordanhasnolife5163
      @jordanhasnolife5163 1 year ago

      Mind giving me a timestamp to reference?

    • @Gerald-iz7mv
      @Gerald-iz7mv 1 year ago

      @jordanhasnolife5163 In the architecture diagram, at timestamp 306 sec.

    • @Shufjskkskf
      @Shufjskkskf 1 year ago +1

      @jordanhasnolife5163 Top right in the diagram at 5:13.

    • @jordanhasnolife5163
      @jordanhasnolife5163 1 year ago

      @Shufjskkskf Ah OK - it just describes any changes made to the set of items that we could possibly be recommending (adds, removes, edits of an existing item) as a result of user action.
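
To make the three kinds of change concrete, here is a hedged sketch of what an item change event and its handling might look like. The `ItemChangeEvent` type, its fields, and the dict-based `catalog` are hypothetical and are not taken from the diagram.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ChangeType(Enum):
    ADD = "add"
    EDIT = "edit"
    REMOVE = "remove"


@dataclass(frozen=True)
class ItemChangeEvent:
    item_id: str
    change: ChangeType
    metadata: Optional[Dict[str, str]] = None


def apply_event(catalog: Dict[str, Dict[str, str]], event: ItemChangeEvent) -> None:
    """Apply one add, edit, or remove to the set of recommendable items."""
    if event.change is ChangeType.REMOVE:
        # Removing an unknown item is treated as a no-op.
        catalog.pop(event.item_id, None)
    elif event.change is ChangeType.ADD:
        catalog[event.item_id] = dict(event.metadata or {})
    else:
        # EDIT: merge the changed fields into the existing entry,
        # creating the entry if it was not present.
        catalog.setdefault(event.item_id, {}).update(event.metadata or {})
```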

    • @Gerald-iz7mv
      @Gerald-iz7mv 1 year ago

      @jordanhasnolife5163 Could that be a good use case for a feature store? Could the user activity also be stored in a feature store? In the case of a music recommendation system, would that information cover song titles, text, images, artists, music genre, and audio analytics?

  • @himanshukriplani6867
    @himanshukriplani6867 6 months ago +1

    I personally liked your iPad notes and previous method more than this.

    • @jordanhasnolife5163
      @jordanhasnolife5163 6 months ago +1

      My iPad notes are my current method - that's why I'm remaking them.

  • @muhammadhabibullah1205
    @muhammadhabibullah1205 2 years ago +1

    Great content!