UUIDs are Bad for Performance in MySQL - Is Postgres better? Let us Discuss

แชร์
ฝัง
  • เผยแพร่เมื่อ 27 ก.ย. 2024

ความคิดเห็น • 67

  • @hnasr
    @hnasr  3 ปีที่แล้ว +12

    Learn the fundamentals of database systems to understand and build performant backend apps
    Check out my udemy Introduction to Database Engineering
    husseinnasser.com/courses

  • @rogersteele2835
    @rogersteele2835 3 ปีที่แล้ว +34

    This channel has been a wealth of knowledge for understanding backend processes

  • @yvestrudeau2761
    @yvestrudeau2761 3 ปีที่แล้ว +83

    Thanks for discussing my article :) Pg will be faster at inserting rows in the table itself which is organized by rowid. It has to maintain a btree for the uuid PK values so it will eventually struggle like InnoDB, maybe a bit later in term of number of rows.

    • @hnasr
      @hnasr  3 ปีที่แล้ว +3

      Good work Yves!! And thanks for clarifying postgres behavior

    • @hnasr
      @hnasr  3 ปีที่แล้ว +2

      Interesting! Thanks for sharing

    • @justcreate1387
      @justcreate1387 ปีที่แล้ว

      Since using a uuid for the primary key is not optimal, what is the most optimal/scalable type of primary key?

    • @ricardoc748
      @ricardoc748 ปีที่แล้ว +2

      @Just Create before using random or autogen pks, consider applying DB best practices for choosing your PK. If there's any data that can guarantee uniqueness, you should consider using them. But if you decide that you must go into artificial PKs, don't optimize inexistent optimization problems with your PKs.
      Spend more time working on a more reliable data structure and a more simple or logical data organisation. You can gain performance or by making things easier to query with a good data structure. This can, for instance, avoid the need for unecessary JOINs and complex subqueries.

    • @justcreate1387
      @justcreate1387 ปีที่แล้ว

      @@ricardoc748 thank you for the words of wisdom!!!

  • @duyquang6
    @duyquang6 3 ปีที่แล้ว +2

    very well explained. this article show that using UUID will break important property of page cache, which is spatial locality

  • @BaishaliGhosh13
    @BaishaliGhosh13 3 ปีที่แล้ว +14

    Your knowledge of RDBMS is amazing. Can you share links to other channels who explain NoSQL databases like MongoDB in a similar manner?

    • @hnasr
      @hnasr  3 ปีที่แล้ว +4

      Thanks! check out Percona they make great content on databases

  • @mosespeter9711
    @mosespeter9711 2 ปีที่แล้ว +2

    Using the MySQL function UUID_TO_BIN(uuid,1) would give you almost same performance as integer. I've been using it for a while now and it's nice

  • @kamesharumugam
    @kamesharumugam 3 ปีที่แล้ว +1

    Thanks @hussein and Yves for this excellent information :) loved this info, reading more about clustered indexing now.

  • @shayanahmad5439
    @shayanahmad5439 3 ปีที่แล้ว +1

    I love your videos.
    I feel if you summarized everything in points or even highlighted the words in blog and made some diagrams on top of this, it would not only lessen the time of the video but also make it more digestible. I seriously hate to speed up your videos because your way of explanation is the USP.

    • @cwgabel
      @cwgabel 3 ปีที่แล้ว +1

      Yes. Hussein has a unique approach to explaining that, depending on your level of understanding of the subject matter at hand can cause either instant boredom or thoughts of losing your mind. I keep my shortcut keys handy for faster/slower video speeds just for those occasions. I use a Shrestha Pujan's chrome extension to tweak the speed step by +-10%. It's nice because you aren't limited to 200% either...

  • @autohmae
    @autohmae 3 ปีที่แล้ว +1

    If I understand it correctly modern GUID in Windows are the same as the UUID RFC, but the RFC has a reserved part for older GUID Microsoft use to have (Microsoft didn't invent it, but the RFC came later and for compatibility it was added).
    No, the biggest project where we communicate IDs with the outside world is also running MySQL (it wasn't my choice, I didn't start the project. PostgreSQL would also have been my choice), it's similar to TH-cam video IDs, base70 or something of a pseudo random number and it's not the Primary Key of the table.

  • @IBITZEE
    @IBITZEE 2 ปีที่แล้ว

    Hussein... you're the man!!! great info...
    I'd like very much to ear from you about the page-mechanics in the databases (mysql, pg, mssql, acess/mdb, etc.)
    Also the way rows are stored... how the data types are 'squeeze' in the record/row...
    also how BLOBs are stored... ?along with the record... in a related-hidden-table... etc.

  • @chairlovawitabat
    @chairlovawitabat 3 ปีที่แล้ว +2

    So the lesson here is that MySQL is best used if your database stores data in a roughly sequential manner, and PostresSQL best for random insertions (UUID insertions)?

    • @sadhlife
      @sadhlife 3 ปีที่แล้ว

      postgres handles both situations pretty well

  • @rachydamine9458
    @rachydamine9458 3 ปีที่แล้ว

    Very well explained, thank you Hussein

  • @KangJangkrik
    @KangJangkrik 3 ปีที่แล้ว +1

    Btw I want to see you trying Vue for the first time, Hussein :)

  • @andrewmartin3671
    @andrewmartin3671 3 ปีที่แล้ว +1

    I always felt that a GUID can be assumed to be unique over all the entities in your application, and UUID can be assumed to be unique over all entities over all applications in the human world. I've know the probability of clashes amongst UUIDs is vanishingly small, bit I can never bring myself to ignore it.

    • @hansenchrisw
      @hansenchrisw 3 ปีที่แล้ว +1

      Deep down you know that the UUID generator that you’re using could have a bug in it that makes odds of collision much higher

  • @luisgarciaalanis
    @luisgarciaalanis 2 ปีที่แล้ว

    Ulids are sequential random guid, they should always be inserted at the end of the table, an alternative to guids

  • @henrikschmidt8178
    @henrikschmidt8178 ปีที่แล้ว

    But if you add all uuids to a unique index that is stored in an unsorted list in postgres, doen´t it require the database to read the whole unique index list to compare every uuid in the list to know that it is not already in there? And in that case sorting it once by making a btree sounds like a lot less work to me.

  • @ventsislavpenev3602
    @ventsislavpenev3602 2 ปีที่แล้ว

    What do you think about the option with auto_increment as PK and a separate unique non-clustered index , with UUID for the "generared-id, id for the outer world", in the context of MySql?
    Maybe, in terms of performance during insert, it will be almost similar, because yes - it will save time for the auto-increment indexing, and for the secondary index it will store only a pointer containing the primary key instead of the whole row data, BUT...two indexes should be maintained and the secondary index, will suffer again from the reordering(since it is B or B+ tree index again) and lack of needed pages in memory, even that index size is smaller and accumulate more pages.

  • @MoemenGaballah
    @MoemenGaballah ปีที่แล้ว

    We want hidden all my ids
    What is the best way to use => UUID as Primary Or (id autoincrement + UUID as a second column as Slug)
    I think we will have an issue with performance when found by UUID and maybe with joins?
    for example, if has posts with UUID and wants only Comments By UUID for Posts we will increase the query.
    Thanks.

  • @hassaneljebyly4346
    @hassaneljebyly4346 2 ปีที่แล้ว

    well.. that's refreshing, thank you very much.

  • @sokunoun6123
    @sokunoun6123 3 ปีที่แล้ว +1

    Very Very thank you for your Video

  • @dheemanthmallikarjun514
    @dheemanthmallikarjun514 3 ปีที่แล้ว

    Great content 👍
    One quick question, for inserting the new row with a ID , who enforces that a row with a particular ID should go to a particular page? The database engine maintains this mapping ?

    • @lakshayasood2197
      @lakshayasood2197 3 ปีที่แล้ว

      yes. it all depends on the column that was used to create clusters(pages) of records. Mostly its is created based on Primary key but if PK is not there then DB engine will auto-select a column (probably a not null one). At the time of insert, the DB will look up into the clusters(pages) to find the best page that matches the index column of the record. Read more at: www.javatpoint.com/mysql-clustered-index

  • @rommellagera8543
    @rommellagera8543 2 ปีที่แล้ว

    As primary key, for me auto increment int (4 bytes) will be fast and compact even compared to long (8 bytes), I could be wrong but kindly enlighten me

  • @adsboosters
    @adsboosters 2 ปีที่แล้ว

    If i share my pc system uuid(Wmic csproduct get uuid) then my pc can be hack?

  • @philippealexandredeleye4608
    @philippealexandredeleye4608 3 ปีที่แล้ว

    I have a question for you.
    In postgres : do you think it’s better to have 1 sequence for Id form all my table or 1 sequence per tables ?

    • @hnasr
      @hnasr  3 ปีที่แล้ว +1

      For sure a sequence per table for concurrency.

    • @philippealexandredeleye4608
      @philippealexandredeleye4608 3 ปีที่แล้ว

      @@hnasr ty for your answer !

  • @milad9900
    @milad9900 ปีที่แล้ว

    Hi Hussein
    But it's in clustered index situation

  • @alinikakhtar7029
    @alinikakhtar7029 ปีที่แล้ว

    tnx alot.

  • @utub369a
    @utub369a 3 ปีที่แล้ว

    Is there any database store uuid in 16 bytes instead of char(36)?

    • @Eckster
      @Eckster 3 ปีที่แล้ว

      I believe Postgres can do that with the UUID column type

  • @EddyCaffrey
    @EddyCaffrey 3 ปีที่แล้ว

    So sad really 😅😅. Great explanation

  • @NGC-gu6dz
    @NGC-gu6dz 3 ปีที่แล้ว

    Oh I learned this lesson last week. painful.

  • @hassamulhaq7762
    @hassamulhaq7762 ปีที่แล้ว

    uuid is bad if you use uuid as a primary _key instead of default. In other case like external usage e.g API you can create new column uuid and use this for external purposes or as a query string.

  • @SiriusBlackNuar
    @SiriusBlackNuar ปีที่แล้ว

    1k😊

  • @CODEDSTUDIO
    @CODEDSTUDIO 3 ปีที่แล้ว +2

    First to comment

    • @pointerish
      @pointerish 3 ปีที่แล้ว +1

      Congrats.

    • @prabhakarjha1168
      @prabhakarjha1168 3 ปีที่แล้ว +2

      you will soon receive you certificate of winning

    • @PiggyQuacks
      @PiggyQuacks 3 ปีที่แล้ว

      HEK YEAH

  • @andreikashin
    @andreikashin 3 ปีที่แล้ว +1

    We know who you are. Remove this half-screen banner.

  • @hritiksingh8268
    @hritiksingh8268 3 ปีที่แล้ว +6

    Can you do a short video on uuid vs sequence . Such as auto incriment in MySQL and bigserial in postgres ..... like which one to use where.. and where not to use

  • @VladimirSitnikov
    @VladimirSitnikov ปีที่แล้ว +2

    Nice review. However, at 17:57 you mention "in PostgreSQL there will be no problem because the table is not clustered". I am afraid you miss the fact that b-tree index pages have fixed (ordered) structure, so inserting random values (ints or uuids) would cause the same issue with random reads of **index** pages

  • @vitiok78
    @vitiok78 3 ปีที่แล้ว +3

    Guys! Be kind to your Database. She doesn't deserve such a disrespect!

  • @amitdb42
    @amitdb42 10 หลายเดือนก่อน

    Does AWS RDS does the pseudo ordering by itself ?

  • @Dimonion1
    @Dimonion1 3 ปีที่แล้ว +1

    Fantastic explanation! Will use your videos as a guide to my colleagues :)

  • @kevinkkirimii
    @kevinkkirimii 3 ปีที่แล้ว

    I don't know if you have come across the 1 billion table project. I am not a postgres dba but I found it interesting

  • @teengrammer4393
    @teengrammer4393 2 ปีที่แล้ว

    We rrright the rrrow, this was both funny and eductional. Kudos!!

  • @giris1866
    @giris1866 2 ปีที่แล้ว

    how nosql databases like mongodb uses uuid as primary key?

  • @CoentraDZ
    @CoentraDZ 3 ปีที่แล้ว

    That was a great explanation, nothing else to say.

  • @boot-strapper
    @boot-strapper 3 ปีที่แล้ว

    2019 is old??? lol

  • @tjalferes
    @tjalferes 3 ปีที่แล้ว

    Howdy. What about uuidv6 for primary key? Also, what about generating the id in your application code versus having the database generate it? Thanks.

    • @hnasr
      @hnasr  3 ปีที่แล้ว +1

      Generating the id client side is require uniqueness, how do you know if another client doesn’t generate the same ID. That is why UUIDs are great since the collision chances are low

    • @tjalferes
      @tjalferes 3 ปีที่แล้ว

      @@hnasr Thanks. Okay, I think I understand you, but I am still sort of confused. I thought uuid are *universally* unique though -- practically speaking ("the probability to find a duplicate within 103 trillion uuidv4 is one in a billion"). As in, even if you have, say, twenty servers, you still get unique IDs. Or is "universal" a misnomer? If so, then what about database sharding? Wouldn't that cause the same issue? Anyway, here is some info on uuidv6, which is sorted: tools.ietf.org/html/draft-peabody-dispatch-new-uuid-format-00