Faster than a regular database index

แชร์
ฝัง
  • เผยแพร่เมื่อ 5 ธ.ค. 2023
  • 👉 How I seed data: • How I add millions of ...
    To learn more about PlanetScale, head to planetscale.com/youtube!
    💬 Follow PlanetScale on social media
    • Twitter: / planetscale
    • Discord: / discord
    • TikTok: / planetscale
    • Twitch: / planetscale
    • LinkedIn: / planetscale
  • วิทยาศาสตร์และเทคโนโลยี

ความคิดเห็น • 60

  • @avrukin
    @avrukin 6 หลายเดือนก่อน +50

    One recommendation for the benchmark - choose the date randomly, ie have your start/start+1 be a random value in the range of values (min, max). The reason you want to do that is because doing a repeated benchmark on the same date, you are very likely hitting either cpu cache or other caching in the application tier, doing each one of the queries in the iteration loop with a random date is both more realistic in terms of actual user behavior as well as avoids the cache hit scenario.

    • @pietraderdetective8953
      @pietraderdetective8953 6 หลายเดือนก่อน

      Yeah this is what I got in mind..also adding the number of repeated tests for the "faster method" while reducing the repeats for the "slow" method accentuates this issue --> makes the slow method slower and the fast method faster --> widening the gap.
      For better result, benchmarks should be done using the same repeat times.

  • @xcrap
    @xcrap 6 หลายเดือนก่อน +2

    There's something satisfying in optimizing code, databases, etc. When you gain significant performance, it just feels right :)

  • @ahmad-murery
    @ahmad-murery 6 หลายเดือนก่อน +18

    Covering index is great but one must keep in mind adding only necessary columns to the composite index, otherwise the DB size will get bigger unnecessarily.
    But again, why the reversed index (email, created_at) was created for? I didn't get it🤔
    Thanks Aaron!

    • @master74200
      @master74200 5 หลายเดือนก่อน +3

      Not only will the size grow but insertion/updates will increase in time as well, as the indexes will need to be updated as well. The more you index, and the more indexes you have, the longer the writes will be.

    • @ahmad-murery
      @ahmad-murery 5 หลายเดือนก่อน

      @@master74200 Exactly👍

  • @GringoDotDev
    @GringoDotDev 6 หลายเดือนก่อน +9

    This is awesome! I'd love to see a video at some point talking about best practices for modeling time series data in MySQL

  • @kyleareich
    @kyleareich 5 หลายเดือนก่อน

    dude your videos are the best. I've learned so much from you. Specifically: your videos cover ONE topic very well, very clearly, and very thoroughly (is that a word)? Thanks for your work!! You're a great speaker/teacher as well.

  • @mikwit
    @mikwit 5 หลายเดือนก่อน +1

    Would love a video on when to use bitmap indexes on low cardinality columns with dynamic multi-column searching (e.g. slicing and dicing data in a dashboard). I know it’s faster than b-tree indexes, but a video showing it would be awesome.

  • @Ratamir
    @Ratamir 6 หลายเดือนก่อน

    Man, I dont usually write comments, but your videos are on the next level. They are just perfect. Amazing work! Great Job!

  • @Lord2225
    @Lord2225 6 หลายเดือนก่อน +3

    This benchmark might be very worng due to caching. Idk how mysql works but every db i worked with has few layers of caching and repeating one query over and over might benchmark only cache itself.

  • @TES-A5
    @TES-A5 6 หลายเดือนก่อน

    Interesting that you mentioned PostgreSQL here and its ability to bring along extra columns with index. Even though it can do that, i was able to make my query faster after using indexes using the method you showed. It went from 450ms avg (single index) to 240ms avg (multiple indexes with ride along columns)

  • @oOShaoOo
    @oOShaoOo 6 หลายเดือนก่อน +1

    Really useful video, especially the information about the primary key within the compound index. Does that mean it is useless to create compound indexes containing the primary key (for example created_at+id) ?

  • @mikevanegmond
    @mikevanegmond 6 หลายเดือนก่อน +1

    Hey Aaron, one thing I was curious about that might be right up your alley: What indexes should I use for a table that can be filtered on any column? Usually you would index fields commonly found in your where statement, but you can't just index every column right?

  • @clixm
    @clixm 5 หลายเดือนก่อน

    Wow, genius!! That benchmark tool is very useful too

  • @kirayamato6128
    @kirayamato6128 6 หลายเดือนก่อน

    get() method is different to select() cuz get() is for collection or object while select() is for db query. I think it's better to use select() instead of get()

  • @nowendwell
    @nowendwell 6 หลายเดือนก่อน +4

    That's a lot of 7s...

    • @PlanetScale
      @PlanetScale  6 หลายเดือนก่อน +4

      I got caught in a loop there

    • @nowendwell
      @nowendwell 6 หลายเดือนก่อน

      @@PlanetScale Loved it. When recursion goes wrong.

    • @Kane0123
      @Kane0123 6 หลายเดือนก่อน

      A joke only for the dedicated few

  • @JanezNovak-fk4qr
    @JanezNovak-fk4qr 6 หลายเดือนก่อน

    This is good stuff. Thank you.

  • @AlexCernat
    @AlexCernat 6 หลายเดือนก่อน +1

    you should mentioned that PK is included in index only on SOME mysql/mariadb engines (i.e. innodb, maybe others); for myisam (well, who use myisam anymore?) and mostly other engines the PK is not included in the indexes
    in rest, great video (as usual), maybe developers should learn more about how a database work in order to not produce junk queries; but with ORMs nowadays, you mostly don't have control anymore about what is executed ...

  • @peteraylin
    @peteraylin 6 หลายเดือนก่อน

    As always, AWESOME!

  • @mihnearadu990
    @mihnearadu990 6 หลายเดือนก่อน

    Hey Aaron, great video but I think you missed something here. I would tell everyone to not add index that they don’t need in their app. Multiple reasons why this is unadvisable but in the end it just ends up slowing the query back down. So in a table with 5Mil entries and 10-20 columns, you would not create a composite key ok created at and any other combination of column/columns you might need.
    Related to this, you can make a video showing people that mysql only uses only an index when performing searches and that it calculates automatically which index is best to use. ( and us such, calculating that for 20 indexes on a table will result in slowing stuff down )

  • @20toninho
    @20toninho 6 หลายเดือนก่อน

    amazing, loved this content ...

  • @rafaelamadigidalsanto113
    @rafaelamadigidalsanto113 6 หลายเดือนก่อน +2

    Your videos are great. I love them.
    I have a suggestion for you. Could you make a video explaining how to optimize this type of query:
    select *
    from order as o
    join order_item as oi on oi.order_id = o.id
    where o.client_id = 123
    and oi.product_id = 321;
    Even if we have an index on the column order.client_id and another one on the column order_item.product_id, this query could be slow if the filtered client_id and product_id appear in many rows. That's because the join implies a lot of index lookups on the other table. The only way I know to optimize this type of query is to create a materialized view with data from both tables and create a single index on both columns. This way, the database can filter both columns at the same time and won't have to perform separate index lookups. The problem is that MySQL doesn't have materialized views, as far as I know.

    • @zgetro
      @zgetro 6 หลายเดือนก่อน +2

      But MySQL does have views. You can use that

  • @ucretsiztakipci6612
    @ucretsiztakipci6612 6 หลายเดือนก่อน +1

    I wonder what would happen if we even add partition for year?

    • @reed6514
      @reed6514 6 หลายเดือนก่อน

      Me too. My queries only deal with a few hundred rows so it doesn't matter for me lol, but it would be interesting to see.

  • @alexrix4063
    @alexrix4063 6 หลายเดือนก่อน

    Video request 🙋🏻‍♂️have been hoping out for a while as it was mentioned on a previous video: geospatial indexes

  • @mahmoud-bakheet
    @mahmoud-bakheet 6 หลายเดือนก่อน +2

    Alright so how we can apply this technique to the searching or filtering stuff ? I mean in a real application , is it mean we have to create different type of index?

    • @zgetro
      @zgetro 6 หลายเดือนก่อน +2

      No, it totally based on your application needs. like for example from application requirements
      Search: i would use algolia.
      Reports: create Views or SP.
      Quey On Large single table: index or combin index.
      Quey on Join table with ORM: will use Ladacache Package for cache in redis

  • @MohammadAminZakeri
    @MohammadAminZakeri 6 หลายเดือนก่อน

    A question : in mysql how to optimize count(*) on a table with more than 150m rows (I mean a large table) and surely I use where in it ?

  • @LenWoodward
    @LenWoodward 6 หลายเดือนก่อน +1

    I'm sure I've just missed it, but I'm not understanding why we had to create the inverse of that index

    • @reed6514
      @reed6514 6 หลายเดือนก่อน

      I think he skipped over that...

    • @mrbigsmile3902
      @mrbigsmile3902 4 หลายเดือนก่อน

      He created it to show with EXPLAIN that the planner chooses the other index. So the reverse isn't necessary

  • @Septumsempra8818
    @Septumsempra8818 6 หลายเดือนก่อน +1

    Paginators. Please help with optimizing pagination which usually implements a count(*)

    • @nunosdonato
      @nunosdonato 6 หลายเดือนก่อน

      dont do it :)

    • @Septumsempra8818
      @Septumsempra8818 6 หลายเดือนก่อน

      @@nunosdonato please elaborate

  • @alexandruciocan3128
    @alexandruciocan3128 6 หลายเดือนก่อน

    In my big table, I noticed a better performance if I'm using the entire datetime for created_at, like:
    Where created_at >= '2023-12-03 00:00:00'
    and created_at

  • @dragonwave2652
    @dragonwave2652 6 หลายเดือนก่อน

    Hello, thank you for your videos. I am improving my queries

  • @RaicaBogdan
    @RaicaBogdan 6 หลายเดือนก่อน

    Those damn sevens!!! :)) haha , top notch info, thank you. I did not considered about these type of situations, I'm smarter than yesterday now. ✌

  • @veetaha
    @veetaha 6 หลายเดือนก่อน

    Aha! I knew you knew some things about Postgres!

  • @softwaredeveloper6791
    @softwaredeveloper6791 6 หลายเดือนก่อน +1

    Is there a way to test repeatedly without having MySQL buffering give better results for subsequent queries? Change InnoDB settings?

    • @a-name-or-not-a-name
      @a-name-or-not-a-name 5 หลายเดือนก่อน

      If you make your dataset much larger than available RAM then RDBMS would not be able to fully cache the data.

  • @TES-A5
    @TES-A5 6 หลายเดือนก่อน

    You should release your editor theme settings... The Aron Theme... Make it happen. 💙

  • @sterben1941
    @sterben1941 6 หลายเดือนก่อน

    The fastest way is to remove the card in the hobby plan

  • @userasd360
    @userasd360 5 หลายเดือนก่อน

    can you provide the data for the queries that you are using

  • @RandomGeometryDashStuff
    @RandomGeometryDashStuff 6 หลายเดือนก่อน

    is created_at stored as string?

  • @RajveerSingh-vf7pr
    @RajveerSingh-vf7pr 6 หลายเดือนก่อน

    13:05 : holly

  • @viniciusataidedealbuquerqu2837
    @viniciusataidedealbuquerqu2837 6 หลายเดือนก่อน +2

    that terminal font is beautiful. care to put it on the description for less people asking?

  • @csanadtemesvari9251
    @csanadtemesvari9251 6 หลายเดือนก่อน

    If you ain't seeded, stay seated and watch this video!

  • @AnthonyBullard
    @AnthonyBullard 6 หลายเดือนก่อน

    PHP? I didn’t know you drove a Lamborghini

  • @medicallabnapata9082
    @medicallabnapata9082 6 หลายเดือนก่อน

    Amazing

  • @Dev-Siri
    @Dev-Siri 6 หลายเดือนก่อน +1

    this is the fastest query for mysql to process:
    it takes 0.000000s and costs $0.00 to run.

  • @reed6514
    @reed6514 6 หลายเดือนก่อน

    EXPLAIN your queries please. And show us your schema!

    • @reed6514
      @reed6514 6 หลายเดือนก่อน +1

      8:43 thank you! I spoke too soon maybe.

    • @PlanetScale
      @PlanetScale  6 หลายเดือนก่อน +1

      Ha, I appreciate the enthusiasm