Easy database indexing strategies

แชร์
ฝัง
  • เผยแพร่เมื่อ 21 ก.ค. 2024
  • Functional indexes and generated columns are a way to index things that aren't traditionally indexable, like JSON columns, data normalizations, or calculations.
    To learn more about PlanetScale, head to planetscale.com/youtube!
    Case insensitive searching video: • MySQL case insensitive...
    Generated hash indexes video: • Faster database indexe...
    --------
    00:00 Functional indexes and generated columns
    00:45 What is a functional index?
    02:11 Example 1: Indexing the end of a string
    05:05 Example 2: Case sensitive indexing
    07:37 Example 3: Indexing a TEXT or BLOB
    09:54 Example 4: Indexing a date part
    12:57 Example 4.5: Indexing a date part, semantically
    16:25 Example 5: Grouping by functional index
    19:08 Example 6: Calculating age
    21:09 Example 7: Data normalization
    24:04 Example 8: Indexing a raw calculation
    26:23 Example 9: Combining multiple statuses, and composite indexes
    31:02 Example 10: Indexing JSON parts
    --------
    💬 Follow PlanetScale on social media
    • Twitter: / planetscale
    • Discord: / discord
    • TikTok: / planetscale
    • Twitch: / planetscale
    • LinkedIn: / planetscale
  • วิทยาศาสตร์และเทคโนโลยี

ความคิดเห็น • 131

  • @outreach360-tech
    @outreach360-tech 8 หลายเดือนก่อน +45

    This video is literally the first time I understood how and why a database administrator can be a dedicated job

    • @BarakaAndrew
      @BarakaAndrew 8 หลายเดือนก่อน

      Lol me too, I just learned about this today even though I always index LOWER() on postgresql. I got tables to optimize now

  • @FarazSamapoor
    @FarazSamapoor 8 หลายเดือนก่อน +15

    Great video! Thank you, Aaron! I really liked the practical, real-world examples that you used. It just makes the whole learning process much easier to remember and put to use.

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +4

      Thank you Faraz! I'm gonna lean into the practical examples for a little while, I agree with you.

  • @metaphoricallyspeaking
    @metaphoricallyspeaking 8 หลายเดือนก่อน +15

    Incredible video - awesome work Aaron (and the PlanetScale team)! 👏

    • @rembautimes8808
      @rembautimes8808 5 หลายเดือนก่อน

      Yeah it’s well structured and presented with an innovative background as well

  • @manishrc
    @manishrc 7 หลายเดือนก่อน +1

    I've learn some many practical things I can use right away in just 3 of your videos. Simply excellent content and production quality. thank you!

  • @oOShaoOo
    @oOShaoOo 8 หลายเดือนก่อน +4

    35:21 I've used the date part indexing on a updated_at column (current timestamp) that was created by an ORM. I've had the need to find all records that were modified on day-1.

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +3

      Oh nice, I love it! Thanks for letting me know! You listened to my plea at the end 🥹

  • @henrymaddocks984
    @henrymaddocks984 8 หลายเดือนก่อน +1

    This is a great series and you're a fantastic presenter

  • @polojuninho
    @polojuninho 8 หลายเดือนก่อน +2

    Amazing video. Use cases are so relatable!

  • @xcrap
    @xcrap 8 หลายเดือนก่อน +5

    Super educational and useful. Thanks :)

  • @carnavalesenpanama
    @carnavalesenpanama 8 หลายเดือนก่อน

    Im using functional index now on my project using case-when statement inside, and works pretty good to generate my is_verified column. Thank you Aaron.

  • @amaelftah
    @amaelftah 8 หลายเดือนก่อน +3

    really nice video Aaron i liked it so much and waiting for more tips and tricks like that

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Thank you! Glad you enjoyed it. It's a bit different style than our normal stuff, we'll see how it does 🤞🤞

  • @PM1_57
    @PM1_57 8 หลายเดือนก่อน +1

    loving the content and your way of explaining it 🔥..

  • @cassiosouza10
    @cassiosouza10 6 หลายเดือนก่อน

    Great content man, best so far, keep it coming.

  • @HJ-jg4ql
    @HJ-jg4ql 8 หลายเดือนก่อน

    These videos are really good.

  • @shutterassault1
    @shutterassault1 4 หลายเดือนก่อน

    Very helpful. I use Oracle all day as a web developer. Need to think about indexing more

  • @kiranjoshi1077
    @kiranjoshi1077 3 วันที่ผ่านมา

    This is simply great knowledge sharing, of course for free of cost.👋

  • @vikasvimal01
    @vikasvimal01 8 หลายเดือนก่อน +1

    Super insightful

  • @Kauto
    @Kauto 8 หลายเดือนก่อน +2

    Great video as always. I would like to see a complex functional index example with a where and a group by.

  • @camilosad1979
    @camilosad1979 7 หลายเดือนก่อน

    amazing content, thank you for sharing

  • @RaicaBogdan
    @RaicaBogdan 8 หลายเดือนก่อน

    awesome video! I like the functional indexes and generated columns a lot now!

  • @bensonnjonjo
    @bensonnjonjo 8 หลายเดือนก่อน +1

    Great content, very useful as usual

  • @philadams9254
    @philadams9254 8 หลายเดือนก่อน +2

    9:00 - Those concerned about collisions should know that it's still extremely rare in md5.
    You could use SHA2, which hasn't been shown to be broken, unless that's slower or not applicable somehow?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +3

      You're correct! You could use any hashing algorithm you like. For just computing checksums I like to use MD5 cause it's so fast, but you're right that it is broken for cryptographic use cases.

  • @marancibia1971
    @marancibia1971 8 หลายเดือนก่อน

    Awesome video! Thanks

  • @yatskanych_oleksandr
    @yatskanych_oleksandr 7 หลายเดือนก่อน

    As always, very good video. And now i want to implement it into my project. Thanks you!

  • @marrtins
    @marrtins 8 หลายเดือนก่อน

    Great channel for MySQL learning!

  • @stephenpaek9175
    @stephenpaek9175 6 หลายเดือนก่อน

    Thank you very much

  • @akateet
    @akateet 7 หลายเดือนก่อน

    so cool!!! thanks

  • @HenriqueFoletto
    @HenriqueFoletto 8 หลายเดือนก่อน +1

    Thank you Aaron and PlanetScale, super helpful and a joy to watch, as always!
    If a may ask a question, or a suggestion for a video: How to index (or improve performance) of a query that relies on data from a different table?
    Let’s say that I have a customers and an orders table, and I’d like to select all customers ordered by the sum of their orders amount, but only the orders placed in the current year, excluding refunded orders?
    A generated columns in the customers table isn’t possible since it’s not possible to access columns from other tables.
    I feel like this could be a common use-case (top customers of the year), but a tricky one to make performant.

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +2

      This is a great example! Adding to my list

  • @rrd_webmania
    @rrd_webmania 7 หลายเดือนก่อน

    Great thanks

  • @gebi84
    @gebi84 8 หลายเดือนก่อน

    Again, great video. I'm learning a lot from you! Can you please make a video, how to read the explain of a slow query?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Yes for sure. On my list!

  • @RimantasBelovas
    @RimantasBelovas 8 หลายเดือนก่อน

    Loved it! 🎅

  • @nickolayfetlistov4416
    @nickolayfetlistov4416 8 หลายเดือนก่อน +1

    You are the perfect!!!

  • @ElPayetSUPERVAN
    @ElPayetSUPERVAN 8 หลายเดือนก่อน

    This is soooo good, thank you sir !

  • @lukeweston1234
    @lukeweston1234 8 หลายเดือนก่อน +1

    I would pay a solid $50 for a full course from you at this point!

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      You shall have it for free! planetscale.com/mysql. Enjoy!

  • @bopuc
    @bopuc 8 หลายเดือนก่อน

    Life ALTERing. Thank you.

  • @GameDesignerJDG
    @GameDesignerJDG 8 หลายเดือนก่อน +3

    For example 6, you could do the index on birthday and do 'SELECT ... WHERE birthday >= now() - 46 - 1 AND birthday < now() - 46;' (if you're looking for age equality for age 46, for example) (Haven't tested this, so I might be wrong about where to put the ±1). That'd be a good explanation of a better approach, I think.

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      I haven't tested your specific query but it seems correct to me, and would be a good alternative! Thanks for commenting 🤗

    • @mityukov
      @mityukov 8 หลายเดือนก่อน +1

      It would be kind to allow some indeterministic indexes, though. Say, if I don't mind to regenerate it once a day at low load time 😅

  • @Soarex
    @Soarex 8 หลายเดือนก่อน +1

    Would have been cool to discuss whether you can store the result of window function (PARTITION BY) in a generated column then index it. Say I want the most recent occurrence (max date) within some group condition, can I index the window function or do I need a materialized view?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Unfortunately that wouldn't work. The window would require looking at multiple rows and a generated column can only look at the row it belongs to (as far as I know!)

  • @leonarddurmishi6265
    @leonarddurmishi6265 8 หลายเดือนก่อน +1

    Hello and very nice video as always, just a question about the is_weekend case, I don't think adding an index to 0 1 column is good as it has very low cardinality, adding an index there would possibly make the query slower, or maybe mysql would ignore it and not use it at all, any thoughts?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Great point! It would depend on the shape of your data for sure. And it might even be the case the the optimizer would ignore it for `is_weekend = 0` and use it for `is_weekend = 1`. You could also use it as part of a compound index, like the "Combining multiple statuses" example.

  • @fernandoacostanet
    @fernandoacostanet 8 หลายเดือนก่อน

    I have no idea about anything 😂 this channel makes mysql looks awesome

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Mission accomplished!

  • @TheROck3052
    @TheROck3052 8 หลายเดือนก่อน

    Great video. I would like to know more about JSON.

  • @TheStabacco
    @TheStabacco 8 หลายเดือนก่อน

    educational and entertaining, keep it up!

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      🫶 Thank you so much!

  • @blackriver2021
    @blackriver2021 8 หลายเดือนก่อน +1

    Superb🎉🎉🎉🎉

  • @fallenpentagon1579
    @fallenpentagon1579 8 หลายเดือนก่อน

    I was thinking "wow Aaron, he's so generous, he's so kind" the entire video

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Now imagine I'd only given you nine examples! You'd be furious! Close call 😮‍💨😮‍💨

  • @arnothar8035
    @arnothar8035 8 หลายเดือนก่อน +2

    18:16 Be careful. Datatype "timestamp" has a risk of having a overflow on dates after 2038-01-19 03:14:07 UTC, which is in kinda near future. I would suggest to use "datetime" datatype instead.

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      You're not wrong 😉 th-cam.com/video/d6HaTHkXqcQ/w-d-xo.html

  • @noamanahmed1
    @noamanahmed1 8 หลายเดือนก่อน

    Awesome videos as always. One quick question. What will happen if there is no email key in JSON data for some of the columns?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      I believe it will just be NULL in that case

  • @Pekz00r
    @Pekz00r 8 หลายเดือนก่อน

    Very good video! I consider myself very knowledgable in SQL, but I learned several things here. Especially the thing with the calculations inside the composite index was really cool.
    Is there a way to make a two way bind with generated function? For example in the last example, is it possible that if you change the email column, the email in the JSON field is also updated automatically?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Unfortunately not! It's a one way binding

    • @Pekz00r
      @Pekz00r 8 หลายเดือนก่อน

      @@PlanetScale Thank you for the answer! That is what I thought. I have to continue to do this in the application layer then. Laravel makes this pretty easy to do in either observers or mutators.

  • @kirilldanshin
    @kirilldanshin 8 หลายเดือนก่อน

    Indexing customer-provided metadata stored as json alongside the main row data
    Not that often of a use case, but still good to know if you are building a SaaS or something

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Ooo that's interesting

  • @BradleyBernard
    @BradleyBernard 8 หลายเดือนก่อน

    Awesome video, really like the is visible case! 🎉
    Where do we draw the line of generated column vs app biz logic? IE: handling a composite flag column via app code vs in DB
    My guess is it depends, but curious if you have any great examples :)

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      It depends! Haha sorry.
      It really does though! I think the question about handling a composite flag at the app layer vs DB layer is easier, because you can't index it at the app layer. If you have an "is_visible" flag that's just a Laravel scope at the app layer, you still end up with 4 or 5 conditions at the DB layer, and that's tough to index. If you push that to the DB you can index it more easily. If that's like a default scope that you use all the time, it might make sense to push that down to the DB layer. Hope that helps! Nice to see you here

    • @BradleyBernard
      @BradleyBernard 8 หลายเดือนก่อน

      @@PlanetScale Haha yes, thanks for the explanation - I should have clarified more: for this case of the composite flags, having it as a generated column where value is handled by DB (then slap an index on it) or having it as a normal column and my Laravel app code updates its value (combining date nullability and flag yes no into one value) and then slap an index on it. Should be identical but the q is: who should do the update logic and why? Or do you think it doesn’t realllllly matter in a case like this? I like letting the DB do the work but haven’t given generated columns a chance yet but now I probably should cuz why not!

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +2

      @@BradleyBernard Ah I see! If it's simple enough (doesnt require much logic) then I prefer generated column because it is impossible for it to get out of sync. If you use something like a model observer in Laravel, you have to be careful that every single update goes through that observer, otherwise you're in trouble.

  • @dave6012
    @dave6012 8 หลายเดือนก่อน

    Hey Aaron, great video. What about filter searching?
    Say I have a stats table of different roles on a team across a dashboard of projects. I want to be able to do a partial (bonus for fuzzy) match on several column values.
    This table can and will be updated often because it’s aggregated from other tables. What would be a good indexing strategy? Sounds like a nice example of generated column + functional index, but I’m not sure 🙂

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Hmmm sounds like a partial (or fuzzy) match on several columns is a great use case for a FULL TEXT index, which allows for that. You might give that a go!

    • @dave6012
      @dave6012 8 หลายเดือนก่อน

      Thanks! I’ll look into that. Of course, if you wanted to make a video on the subject, I always enjoy how simply and eloquently you explain these concepts with examples 🙂

  • @Gerbuuun
    @Gerbuuun 8 หลายเดือนก่อน +1

    Can you do a video about data aggregation? e.g. json_arrayagg and if/why you should use them

  • @customraspi
    @customraspi 2 หลายเดือนก่อน

    Can you also make a video about MemCP database which is wire compatible to MySQL and does such indexes automatically?

  • @SirKenchalot
    @SirKenchalot 8 หลายเดือนก่อน +1

    How about when you use encrypted values in a table? Can/should you index an encrypted column? By encrypted, I mean encrypted by the api or the app, not encrypted at rest under the hood.

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      If you encrypt at the app layer and store it in the DB encrypted, you could totally index that. I'm not sure you'd even need a functional index at that point!

  • @fernandopillo
    @fernandopillo 8 หลายเดือนก่อน

    Amazing! Do you have any vídeo about Explain? Thanks!

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Not yet! It's on my list though. Soon!

  • @TrondHus
    @TrondHus 7 หลายเดือนก่อน

    This is interesting...
    I am trying to make a query I have faster. Without any search parameters, the result is returned in 4,4 seconds (which is to long). And I bet it has something to do with the query joining just 28 tables. The biggest table is just 13k big.
    I also have some GROUP_CONCAT in the Select part (removed sub queries)...
    So small data...
    So I am watching any video on indexing with great interest.

  • @taquanminhlong
    @taquanminhlong 8 หลายเดือนก่อน

    Hey, do generated columns or functional indexes impact performance compare to static columns (normal one) when it comes to querying 100000 rows for analytics and report, etc... ?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      It depends! If you write the generated column to disk it acts as a normal column. If it's virtual, then it's just like a macro for the underlying calculation. If you're going to be filtering against it, I'd probably put an index on it or make it a stored generated column

  • @nelson6e65
    @nelson6e65 8 หลายเดือนก่อน

    Those generated columns do work in MySQL 5.7 same as in MySQL 8 (intelligent index)?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Generated columns work the same in MySQL 5.7 and 8! Functional indexes are an 8 only feature

  • @TravisFont
    @TravisFont 8 หลายเดือนก่อน

    A bit long, but a lot of fire examples 🔥🔥👍

  • @pietraderdetective8953
    @pietraderdetective8953 8 หลายเดือนก่อน

    hey great video as always!
    idea for a new video: what about making a CRUD app but with raw SQL?
    sounds fun!

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Interesting! I'll add it to my list

  • @ryanriddell8376
    @ryanriddell8376 8 หลายเดือนก่อน +1

    What app are you using to write/execute the queries?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      TablePlus! th-cam.com/video/7V_CJBPZPes/w-d-xo.html

  • @bradvanskyhawk7119
    @bradvanskyhawk7119 8 หลายเดือนก่อน

    Question on example 9, when you added the (views > 9000). I don't think this part of the index was used in your explain plan. I think only the is_visible is used. I think you will see an attached_condition in the json formatted explain plan with views > 9000. Do you know why this is?

  • @run-dd
    @run-dd 8 หลายเดือนก่อน

    OK - just to contextualize - Postgres has functional index for at least 20 years (quick search -> V7.2 from 2002-02-04) - so nice to learn about, but it's not new, just said because you introduce with words like in case your MySQL hasn't that feature... If that's the case maybe MySQL is the wrong DB

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Still, pretty good examples right?

  •  8 หลายเดือนก่อน +1

    Nice. I’m ready to become a MySQL Monk

  • @mrbigsmile3902
    @mrbigsmile3902 8 หลายเดือนก่อน +2

    MySQL uses 'unsigned tinyint' instead of boolean?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +2

      Yup! Boolean also works, but is an alias for unsigned tinyint

  • @nikolav6350
    @nikolav6350 7 หลายเดือนก่อน

    Dear Aaron,
    I recently watched your indexing video, but I'm still grappling with a problem. In my posts table, there's a 'likes_count' column that tracks the number of likes for each post. This count changes constantly as users like posts. However, I'm unsure whether creating an index on the 'likes_count' column is good idea, considering its frequent updates. When attempting to retrieve the most current popular posts orderby likes_count, I'm uncertain about the best approach. Any insights on this would be immensely helpful.

  • @filon861
    @filon861 8 หลายเดือนก่อน

    i fucking love how u explain things, not a single boring moment. i learned a lot. thanks `select * from people where name = 'Aaron' and is_cool = 1`

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      You're too kind. Thank you!

  • @chrishabgood8900
    @chrishabgood8900 8 หลายเดือนก่อน

    Will these work in Postgres?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      I'm honestly not sure!

  • @rajapanda832
    @rajapanda832 8 หลายเดือนก่อน

    Can you please make a video about Partitioning in MySQL

  • @vishaldinesh
    @vishaldinesh 8 หลายเดือนก่อน

    Hey it's Mr dunphy from planet scale, Hi.

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Hey kids, it's MySQL Dad. I believe in you.

  • @BrunoBernard-kn6vt
    @BrunoBernard-kn6vt 8 หลายเดือนก่อน

    How do you populate your database for videos ?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      I use a little Laravel app and the Faker library

    • @BrunoBernard-kn6vt
      @BrunoBernard-kn6vt 8 หลายเดือนก่อน

      Thanks :) do you mind open source it ?

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      It's just one giant script file 🤐

  • @griffadev
    @griffadev 6 หลายเดือนก่อน

    What is this GUI?

  • @abdellahcodes
    @abdellahcodes 8 หลายเดือนก่อน

    you've missed something: kitchen sink
    awesome video Aaron

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Haha you had me scared at first! Thank you so much

  • @lpanebr
    @lpanebr 8 หลายเดือนก่อน

    It's all good but the DBA has to explain to the devs so that they write queries using the substring, the md5 etc... Then the devs need to do that..

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Depends on the direction I suppose... if the devs are already doing it and the DBA is trying to index it, you're free! Otherwise, a generated column might be the easiest way to communicate across teams, since it has a nice neat name

    • @lpanebr
      @lpanebr 8 หลายเดือนก่อน

      @@PlanetScale I love your content. Thank you.

  • @user-gm8xr7qr2l
    @user-gm8xr7qr2l 8 หลายเดือนก่อน

    I am neither a DBA nor i Work with mysql.
    Still, i watched this video until the end, and i feel like i am missing out something

  • @davidmutua6971
    @davidmutua6971 6 หลายเดือนก่อน

    MySQL 8.x supports functional indexes but MySQL 5.7 doesn't. Also mariadb doesn't. Am on mariadb instead you use a generated column. Then add a normal index there.
    ALTER TABLE people ADD COLUMN month_created INT GENERATED ALWAYS AS (MONTH(created_at)) VIRTUAL;
    ALTER TABLE people ADD INDEX idx_month_created (month_created);

  • @Svish_
    @Svish_ 8 หลายเดือนก่อน

    Why in the world doesn't MySQL have a proper type for `boolean` values? Seems so weird and awkward to see `TINYINT` being used for a true/false type thing. 🤔

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน

      Haha you can type boolean, but it turns into a tiny int. It's an alias

    • @Svish_
      @Svish_ 8 หลายเดือนก่อน

      @@PlanetScaleYeah, seems a bit weird to me. Maybe you can do a video on booleans in sql, if you can find some history on why there's no proper booleans in mysql?

  • @nonnnth
    @nonnnth 8 หลายเดือนก่อน

    Did he just do an age reveal???

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      Or did I? We'll never know!

  • @FGj-xj7rd
    @FGj-xj7rd 8 หลายเดือนก่อน

    Adding JSON to relational database was a mistake. I have had so many dreadful queries to deal with regarding them 😂
    So much unmaintainable doodoo 😭💩

    • @PlanetScale
      @PlanetScale  8 หลายเดือนก่อน +1

      It can get out of control quickly, that's for sure