What happens when ID column reaches its max value - Dissecting GitHub Outage

แชร์
ฝัง
  • เผยแพร่เมื่อ 22 ธ.ค. 2024

ความคิดเห็น • 69

  • @tejasthorat
    @tejasthorat 4 หลายเดือนก่อน +20

    TH-cam Short Worked !!!

  • @baibhabmondal1740
    @baibhabmondal1740 2 ปีที่แล้ว +6

    Thinking of taking up your masterclass. Scouting some more videos before taking the call. So far so good! Finally someone who sees past the interviews and is a true advocate of CS!

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +3

      Thanks. I love CS to the core; and hopefully, I see you in my future cohort :)

  • @mzubair10
    @mzubair10 10 หลายเดือนก่อน +1

    Thanks for making such quality content and making it available to everyone free of cost Arpit. The most valuable resource any human can have is time and someone like you spending your personal time to help others improve, is a significant and noble sacrifice. Thank you Sir!!

  • @siddu6003
    @siddu6003 11 หลายเดือนก่อน +9

    17:07 bruh ☠️

    • @snlagr
      @snlagr 2 หลายเดือนก่อน

      😂😂

  • @anirbanb1908
    @anirbanb1908 2 ปีที่แล้ว +20

    Would you not do "Insert into users select * from users_old order by id desc". It is more probable that our application READS the recent writes more compared to the old ones.

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +4

      Yeah for sure. Users can start seeing the recent items quicker. Thanks.

    • @karanverma9197
      @karanverma9197 11 หลายเดือนก่อน

      How much approximate time would it take to copy 2 billion rows to the new table?

    • @guytonedhai
      @guytonedhai 9 หลายเดือนก่อน +1

      @@karanverma9197 Depends on the number of columns, etc. But we ran a migration for 100 million records, each row being around 3000 bytes on a 16 core machine with 64 GB ram. This took us around 10 minutes.

  • @vasubabujinagam5977
    @vasubabujinagam5977 4 หลายเดือนก่อน +1

    use negative range of int till you fix your code. Create a trigger to auto increment on every insert
    FYI this is a quick workaround and you will have enough time to fix the issue

  • @ankit-jangid
    @ankit-jangid 10 หลายเดือนก่อน

    well said, Arpit. we learn from the dissection more as its more practical. Thanks for sharing the approaches.

  • @harshsingh-eq5ix
    @harshsingh-eq5ix ปีที่แล้ว

    Absolutely amazing Dissection and easy to understand as well.

  • @aniksaha6599
    @aniksaha6599 2 ปีที่แล้ว +6

    Also arpit i think a quick code change till the data is being migrated if userid less than 2^32 Select from oldtable else select from new table

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +1

      Yeah. That's a solid hack to apply 🤘

    • @adityadubey5204
      @adityadubey5204 3 หลายเดือนก่อน

      Ye best hai 🙌

  • @ashifismail9606
    @ashifismail9606 4 หลายเดือนก่อน

    Bro, you are a GEM !

  • @abhishekkoranga1378
    @abhishekkoranga1378 2 ปีที่แล้ว +1

    2nd Approach is beauty

  • @NitishChandani-q8g
    @NitishChandani-q8g 9 หลายเดือนก่อน

    Thanks for covering this in great detail!
    Too many ads though breaks the attention span

  • @ajith8730
    @ajith8730 2 ปีที่แล้ว +2

    Awesome explanation 👌🏻 Thanks Arpit. Waiting for next outage😂

  • @akashshirale1927
    @akashshirale1927 2 ปีที่แล้ว +1

    Very Informative video.Thanks for sharing.

  • @akashyadagouda896
    @akashyadagouda896 ปีที่แล้ว

    Good explanation.
    Really faced this issue

  • @Vikashkumar-js8hb
    @Vikashkumar-js8hb ปีที่แล้ว +1

    Hi..I have one doubt. At 22:55 if you have already altered the auto increment to start from 2147483648, then if we execute the last line to copy old data, won't it start adding from the new increment value i.e. from 2147483648, 2147483649 ..etc and there will be no value from 0 to 2147483647...plz explain

    • @babu_g19
      @babu_g19 ปีที่แล้ว

      There would be no data at the time of table creation but slowly data will be added from the users_old table so later on we will have data from 0 to 2147483647

    • @najimali32
      @najimali32 5 หลายเดือนก่อน

      No auto increment will always the take the larger value. If you insert any id below it will let you insert until & unless it is duplicate key.

    • @Spike_104
      @Spike_104 24 วันที่ผ่านมา

      no cause you are also inserting their ids from user_old to user_new auto increment only works as default if nothing is given

  • @tusharabbott
    @tusharabbott 2 ปีที่แล้ว

    Would love to see more like this. Very informative video.

  • @SreekantShenoy
    @SreekantShenoy ปีที่แล้ว +1

    Good info Arpit! I think one question remains how the 'old_users' and 'new_users' table will work along since now the records would be apart (incase required). Creating a view named 'users' fetching from both the tables should be good right?

  • @harshitanand7349
    @harshitanand7349 9 หลายเดือนก่อน

    Great video Arpit. I mean as a developer no one will expect us to fix this as it’s mostly done by DBA but yeah knowing the concept was insightful.
    So next time if Duplicate Key exception arises 🤣🤣🤣 I’ll have this scenario in my mind as well.

  • @windupbirdgaming
    @windupbirdgaming ปีที่แล้ว

    Hey Arpit, Few queries:
    1. Since we are duplicating billions of row from one table to another in the second approach. What if the server does not have the disk space to store duplicate data for both tables.
    2. How to handle this in sharded database case. Will the same approach be able to take care of sharded tables. Like the distribution of rows on different machines etc.

  • @VaibhavVarshney-d4f
    @VaibhavVarshney-d4f หลายเดือนก่อน

    want to know when this happens, can we move to sharding? will it take more time in this case?

  • @alexsanjose
    @alexsanjose 2 ปีที่แล้ว

    Can you create a new table as a UNION of the existing, new tables and rename appropriately to solve the issue quickly?

  • @VishnuVO
    @VishnuVO 10 หลายเดือนก่อน

    What if we have references to the user ID in other tables?

  • @kushalkamra3803
    @kushalkamra3803 2 ปีที่แล้ว

    Awesome, thank you

  • @roshnigupta9296
    @roshnigupta9296 2 ปีที่แล้ว +1

    I am wondering what will happen when the insertion is in progress from users_old to users and there is a query for the data which isn't migrated in that moment of time, IMO, the 2nd approach only holds good, if data from the users' table(in this scenario) is not queried at a higher rate for select operations, else there is another outage to handle for data not found, even though its there just sitting in another table(users_old) 😊

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +1

      You first create a blank table with a new name, migrate the data and then rename it to the table name.
      My bad. I should have used the name users_new instead of users_old. Woould have been much clearer.

  • @paulsofts7468
    @paulsofts7468 9 หลายเดือนก่อน

    Hi Arpit,
    I really enjoyed the tutorial. Just out of curiosity, the data migration operation from users_old to users table would potentially range from several hours to days, and in between them there may be a chance of multiple insert operations. So, does it maintain the sequence in the table over the ID column?

  • @adityadubey5204
    @adityadubey5204 3 หลายเดือนก่อน

    Can we use negative ids ?

  • @akashagarwal6390
    @akashagarwal6390 2 ปีที่แล้ว

    silly question - if its a signed int with min value < 0, why the first row insertion started with 1?

  • @PuneetKumar-il4bv
    @PuneetKumar-il4bv ปีที่แล้ว

    Buddy we can't run alter table on such huge tables for that there is a tool percona memory management for online alter. Pt online schema change alters in background.

    • @AsliEngineering
      @AsliEngineering  ปีที่แล้ว

      I know and this is just a demo. Also, percona just creates a shadow table and switches.

  • @sahanahr6261
    @sahanahr6261 2 ปีที่แล้ว +1

    How long will it take to copy from user_old to user table of 2 million records?
    Bcz my doubt is if it takes longer time
    Might cause data inconsistency right as application code search for some record in newly created users table but record from user_old table in not yet migrated

    • @mayankmukundam6404
      @mayankmukundam6404 10 หลายเดือนก่อน

      Migrate first & then renaming would solve this, I guess

  • @guytonedhai
    @guytonedhai 9 หลายเดือนก่อน

    I think renaming till will only work if you have no foreign key reference on the table.

    • @AsliEngineering
      @AsliEngineering  9 หลายเดือนก่อน

      yes. most companies are scale do not have them to allow no downtime migrations.
      most handle this at a transactional level.

  • @saurabhsuman4960
    @saurabhsuman4960 หลายเดือนก่อน

    How the read will work if copy will take days, please explain this part too

    • @AsliEngineering
      @AsliEngineering  หลายเดือนก่อน

      You do an inplace migration.

  • @samarthtyagi3142
    @samarthtyagi3142 9 หลายเดือนก่อน

    what would happen if after 100 we try to add a row with id 95 ?
    would mysql allow it? or only values greater than last row should be added?

    • @najimali32
      @najimali32 5 หลายเดือนก่อน

      It will allow it, If the 95 is not preset in the table then it will allow it.

  • @Gurpreegill1962
    @Gurpreegill1962 9 หลายเดือนก่อน

    Beekeeper Studio is amazing. I like sequel ace tho

  • @sibyabin
    @sibyabin 2 ปีที่แล้ว

    Dont you think the user to user_old rename can also slowdown considering it already had 2B records. ?

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +1

      It's just a rename operation. Lightning fast. And not dependent on the amount of data.

  • @prashantshubham
    @prashantshubham 2 ปีที่แล้ว

    Great video, small query when we are altering the table from int to bigint the application level code will also require changes correct? Otherwise failures will be imminent.

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว

      Why should there be any change in application code, so long as your programming language can handle that integer range? Can you elaborate on possible errors you are concerned about.

    • @prashantshubham
      @prashantshubham 2 ปีที่แล้ว +2

      @@AsliEngineering for example if we are using `int` for ids in Java shouldn't we change `id` data type now to `BigInteger` corresponding to the SQL schema changes?

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +2

      @@prashantshubham I assumed you were using Python. But if you are using a type not compatible with the type you are suing then application needs an alteration.

  • @subhaspaul495
    @subhaspaul495 2 ปีที่แล้ว +1

    taking UUID into consideration can solve this issue well

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +2

      UUID index very poorly hence is avoided at scale.
      If we consider to use UUID after this outage then a lot of application code and other peripheral systems would need to change of they assumed integer IDs.
      But I see the point in moving into a ID with a larger width and solving the problem.

    • @subhaspaul495
      @subhaspaul495 2 ปีที่แล้ว

      @@AsliEngineering understand your point thanks for correcting me
      I also like to know that how can we optimize uuid for better performance ??
      Pls, share your thoughts

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +1

      @@subhaspaul495 you cannot optimise. UUIDs are massive as compared to integers. 4x big. So it bloats up index by a factor of 4.
      Making your db less performant. Personal experience 😀

    • @subhaspaul495
      @subhaspaul495 2 ปีที่แล้ว

      ​@@AsliEngineering got it thanks
      so how would we store ids uniquely across all DBs and also for better performance ??
      my thoughts would be to store 2ids
      one(_id: INT) is for PK and another as a unique identifier (use UUID here) and use this uuid for relation mapping
      is this a good way to do it??

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว +1

      @@subhaspaul495 as soon as you store big ID in index it will hamper the performance. the best way to do this is use normal integer ID for local indexing and the use the partition key to mathematically find which DB the entry resides.

  • @manavdarji5806
    @manavdarji5806 2 ปีที่แล้ว

    Very informative video. Thanks for sharing.
    Is this a feasible solution? Add an additional column and make id and that newly added column as composite key? I think this will affect the application layer as well, but would love to know your thoughts.

    • @AsliEngineering
      @AsliEngineering  2 ปีที่แล้ว

      Application layer affected for sure.
      In any case you table is altered, be it ID or other column being added. So time taken remains the same. No potential benefit, rather this requires you to change your app code.

    • @manavdarji5806
      @manavdarji5806 2 ปีที่แล้ว

      @@AsliEngineering Makes sense. Thanks!

  • @_ipublic
    @_ipublic 11 หลายเดือนก่อน +1

    I’m giving my name to you❤

  • @barebears289
    @barebears289 2 ปีที่แล้ว +1

    Interesting