The *Right* Way to do Batch Job Data Joins | Systems Design Interview 0 to 1 with Ex-Google SWE

  • Published on 22 Jan 2025

Comments • 20

  • @kunalsinghal3558
    @kunalsinghal3558 9 months ago +3

    "10k subscriber mark which I should hit and probably I don't know one to two years from now" . Congrats boi 28.7k subs done. Miles to go ahead.

    • @jordanhasnolife5163
      @jordanhasnolife5163  9 months ago

      🚀🚀

    • @yaswanth7931
      @yaswanth7931 8 months ago

      @jordanhasnolife5163 30.7k 🚀, 2k in 2 weeks

  • @recursion.
    @recursion. 1 year ago +4

    Hey man, would you be interested in making a video about your learning journey with the technologies, languages, frameworks, and other system design topics? Just wondering what a massive chad sigma 10x programmer's journey would look like...

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +1

      Ah well, 10k should be coming up soon... Perhaps then :)

    • @recursion.
      @recursion. 1 year ago +2

      @jordanhasnolife5163 Are you asking me to buy 300 subscribers? [insert "bombastic side eye dog pic"]

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +3

      @recursion. I'm certainly not asking you not to! (*cute anime girl expression*)

    • @recursion.
      @recursion. 1 year ago +1

      @jordanhasnolife5163 I don't know what you want (300 sub otw🤤)

    • @recursion.
      @recursion. 1 year ago

      @jordanhasnolife5163 80 less subs 😩😩 (please make my wish come true tho)

  • @sahilguleria6976
    @sahilguleria6976 5 months ago +1

    Hello Sir, I’ve got to be honest, sort-then-shuffle is still shuffling my brain. I’m not seeing how sorting helps here. From what I understand, after sorting, we hash the key to decide its partition and then add the key-value pair. But wouldn’t the merge process be the same even if we skipped the sorting foreplay? Or am I missing some magical sorting powers here?
    Or are we merging right after the sort, and then these sorted and merged values are merged again on the partition?

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago

      Think about the time complexity of merging lists that are already sorted versus sorting one big unsorted list. In the shuffle phase we're still sending the data in sorted form to each reducer.
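To make the reply above concrete, here is a minimal sketch, not taken from the video. It assumes each mapper partitions its records by a hash of the key and sorts each partition locally, so a reducer only has to merge k already-sorted runs (roughly O(n log k) comparisons) instead of re-sorting n unsorted records (O(n log n)). The names `partition_of`, `map_side`, and `reduce_side` are hypothetical, and `zlib.crc32` is just a stand-in for whatever stable hash a real framework would use.

```python
import heapq
import zlib

def partition_of(key, num_reducers):
    # Stable hash so the same key always lands on the same reducer.
    return zlib.crc32(key.encode()) % num_reducers

def map_side(records, num_reducers):
    # Split one mapper's output into per-reducer runs, then sort each run.
    runs = [[] for _ in range(num_reducers)]
    for key, value in records:
        runs[partition_of(key, num_reducers)].append((key, value))
    for run in runs:
        run.sort()
    return runs

def reduce_side(sorted_runs):
    # heapq.merge walks the sorted runs lazily and never re-sorts them.
    return list(heapq.merge(*sorted_runs))

mapper_a = map_side([("cat", 1), ("ant", 2), ("dog", 3)], num_reducers=2)
mapper_b = map_side([("bee", 4), ("ant", 5), ("cat", 6)], num_reducers=2)

# Each reducer receives one sorted run per mapper and merges them.
reducer_inputs = [reduce_side([mapper_a[p], mapper_b[p]]) for p in range(2)]
```

Because every run arrives sorted, records with the same key end up next to each other in the reducer's merged stream, which is what lets the reducer process one key group at a time.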

  • @sahilguleria6976
    @sahilguleria6976 5 months ago +1

    What does it mean that the merging can be done entirely on disk? My little mind just can’t seem to comprehend it!

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago

      You don't need to load the whole dataset into memory. Just one entry at a time from each dataset that you're merging.
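A rough sketch of what "one entry at a time" could look like, assuming two sorted text files where every line ends with a newline. The helper `merge_sorted_files` and the file names are made up for illustration. At any moment only the current line from each input is held in memory, and the merged output goes straight back to disk.

```python
import os
import tempfile

def merge_sorted_files(path_a, path_b, out_path):
    # Holds at most one line from each input file in memory at a time.
    with open(path_a) as fa, open(path_b) as fb, open(out_path, "w") as out:
        line_a, line_b = fa.readline(), fb.readline()
        while line_a and line_b:
            if line_a <= line_b:
                out.write(line_a)
                line_a = fa.readline()
            else:
                out.write(line_b)
                line_b = fb.readline()
        # One input is exhausted; copy whatever remains of the other.
        while line_a:
            out.write(line_a)
            line_a = fa.readline()
        while line_b:
            out.write(line_b)
            line_b = fb.readline()

def demo():
    with tempfile.TemporaryDirectory() as d:
        a, b, out = (os.path.join(d, n) for n in ("a.txt", "b.txt", "out.txt"))
        with open(a, "w") as f:
            f.write("ant\ncat\nemu\n")
        with open(b, "w") as f:
            f.write("bee\ncat\ndog\n")
        merge_sorted_files(a, b, out)
        with open(out) as f:
            return f.read().split()

merged = demo()
```

The same idea extends to more than two inputs by keeping the current line of each file in a small heap.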

  • @ReadTheUnderstory
    @ReadTheUnderstory 10 months ago +1

    Just curious, where did you source your information for this particular video? DDIA doesn't do a super clear job explaining how the different joins work imo, and other online sources I've looked at are equally vague.

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 months ago +4

      I worked pretty much exclusively on big data pipelines for a bit haha. But besides that, most of these lessons from DDIA you can think through yourself. When would it help me to perform a sort merge join? If my data isn't already sorted, what would be the penalty there?
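For anyone working through those questions, here is a small sketch of a sort-merge join. It is an illustration under assumptions, not the video's code: both inputs are lists of (key, value) pairs, the function name `sort_merge_join` is hypothetical, and the join is an inner join. If the inputs arrive sorted, the join is a single linear pass; if not, the penalty is the up-front sort, shown here as the `sorted(...)` calls.

```python
def sort_merge_join(left, right):
    # Inner join of two lists of (key, value) pairs, both sorted by key.
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the end of the run of matching keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == left_key:
                for r in range(j, j_end):
                    out.append((left_key, left[i][1], right[r][1]))
                i += 1
            j = j_end
    return out

# Unsorted inputs pay the O(n log n) sort before the linear merge pass.
users = sorted([(2, "bob"), (1, "alice"), (3, "carol")])
clicks = sorted([(3, "/home"), (1, "/cart"), (1, "/pay")])
joined = sort_merge_join(users, clicks)
```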

  • @htm332
    @htm332 1 year ago +3

    Solid video my guy. What resources do you use to learn this stuff?

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +2

      DDIA, random YouTube videos/websites, my own experience

    • @htm332
      @htm332 1 year ago +1

      @jordanhasnolife5163 care to share any of those YT channels/sites?

  • @user-se9zv8hq9r
    @user-se9zv8hq9r 1 year ago +3

    so whens the onlyfans starting