Secret To Optimizing SQL Queries - Understand The SQL Execution Order
- Published on May 15, 2023
- Get a Free System Design PDF with 158 pages by subscribing to our weekly newsletter: bytebytego.ck.page/subscribe
Animation tools: Adobe Illustrator and After Effects.
Check out our bestselling System Design Interview books:
Volume 1: amzn.to/3Ou7gkd
Volume 2: amzn.to/3HqGozy
The digital version of System Design Interview books: bit.ly/3mlDSk9
ABOUT US:
Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.
Great video! One addition: The "EXPLAIN" command is an invaluable tool for optimizing SQL queries. It provides a detailed execution plan, allowing the developers to understand how the database engine processes a query. By analyzing the execution plan, you can address the performance bottlenecks with proper optimizations, e.g. proper indexes.
Thanks for sharing this.
Thanks a lot for the addition, really good :)
@cmertayak - I second that. It's an awesome command that I use often at work to optimise. It's my go-to command for improving query execution.
Thanks for sharing
Oh yes, if you run EXPLAIN in a desktop client like MySQL Workbench, it shows you a detailed diagram of your query, which is quite useful.
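A minimal sketch of the EXPLAIN workflow described in this thread, using Python's built-in sqlite3 module so it runs without a database server. The orders table, its columns, and the index name idx_orders_customer are made up for illustration. SQLite's variant of the command is EXPLAIN QUERY PLAN; MySQL and PostgreSQL print their plans in a different format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def plan(sql):
    # Each plan row is (id, parent, notused, detail); keep the readable detail text.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT amount FROM orders WHERE customer_id = 42"

plan_before = plan(query)  # no usable index yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # the planner can now seek into the index

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing the two lines is the whole exercise: a plan that goes from SCAN to SEARCH ... USING INDEX means the new index is actually being used.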
The way you explained this with the animations is awesome. Great job. Very well explained.
Thank you for a fantastic visualization of the SQL queries execution order. That's exactly what I have been missing in the other materials. I really appreciate your style of teaching
Opt for indexes on columns used in SELECT, WHERE, and JOIN clauses.
Use full-column comparisons instead of partial or computed comparisons (e.g. startsWith).
Avoid ORDER BY on large data retrieval.
Use LIMIT with a small page size and paginate to fetch more data.
Could you explain how? What if I need a large amount of data retrieved with ORDER BY? How would I use LIMIT and pagination in this case? Thanks
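One common answer to the question above is keyset (or "seek") pagination: remember the last sort key of the previous page and ask only for rows after it, so each page is a short index range read instead of a sort of the whole table. The sketch below assumes a hypothetical orders table sorted by (order_date, id) with a matching index; it does not cover sorting by an aggregate, which has to be computed before it can be ordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, order_date, amount) VALUES (?, ?, ?)",
    [(i, f"2023-01-{(i % 28) + 1:02d}", float(i)) for i in range(1, 101)],
)
# The index matches the sort key, so every page can be read in index order.
conn.execute("CREATE INDEX idx_orders_date_id ON orders (order_date, id)")


def fetch_page(after=None, page_size=10):
    """Return the next page ordered by (order_date, id), starting after the key `after`."""
    if after is None:
        sql = "SELECT order_date, id FROM orders ORDER BY order_date, id LIMIT ?"
        params = (page_size,)
    else:
        sql = (
            "SELECT order_date, id FROM orders "
            "WHERE order_date > ? OR (order_date = ? AND id > ?) "
            "ORDER BY order_date, id LIMIT ?"
        )
        params = (after[0], after[0], after[1], page_size)
    return conn.execute(sql, params).fetchall()


pages = []
last_key = None
while True:
    page = fetch_page(last_key)
    if not page:
        break
    pages.append(page)
    last_key = page[-1]  # (order_date, id) of the last row seen

print(len(pages), "pages; first row:", pages[0][0])
```

Unlike LIMIT with a growing OFFSET, the cost of a page here does not depend on how deep into the result set it is.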
bro this way of teaching really, really makes sense. thanks a lot for these visuals.
Simple and to the point explanation. Love it. Thanks 👍
One of the best SQL videos I have come across, just the way it is put together and the infographics. If you are learning SQL, you really should understand the mechanics behind optimizing queries, how databases work. Just adding more hardware or VM resources will not fix the issue if your queries are not optimized properly.
Very well presented, thanks for explaining SARGAble concept
This is the best explanation I've ever seen. Big thumbs for you!
Wow. To the point with knowledge I can use today. Thank you.
thanks, helped clear up some issues I had.
wow, what an awesome introduction to SQL optimization.
Very good intro. Would like a more detailed explanation on more complex queries.
they don't do detailed explanations. it's basically "use indexes". don't sort lots of data. well, thanks.
@jonbaird9718 agreed, YouTube is made for juniors
Nice and simple explanation. Thanks
*Explanation level is so beautiful!*
Great video, very informative and well explained bravo!
Amazing. Thank you!
Excellent video explaining basic concepts in very short time..❤
Impressive graphic animation, could you please share how the execution plan animation was done
Thank you for your time and effort in explaining these subjects. I really like it, and moreover I am able to register the concepts easily. Thanks again.
Very profound, please share more on SQL like windows and CTE, your explanation is very approachable.
Thanks for your sharing Bro's.
Hi Sir, thank you 🙏 for taking the time to explain SQL. Sorry, I am new, and this is very helpful.
Excellent explanation, thanks!
cool, didn't think it's possible to include all these concepts in 6 min video. One thing, it's great to watch it when you want to summarise already existing knowledge
Love your channel. Your videos are great.
Additionally, for the optimizer to "make up" a reasonably good plan (from the various alternatives), it needs to know a bit about the data (value) distribution. This is where STATISTICS / ANALYZE (depending on the DB vendor) come in handy. They help the optimizer estimate the various steps (rows, size of data, etc.) of each plan and figure out which of the different plans is the best candidate to execute. It is therefore important to collect this information on critical columns (usually the columns used in joins and WHERE clauses). It is also important to refresh this information regularly so that the optimizer does not make bad decisions based on stale statistics. Very bad things can happen with stale statistics.
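As a small illustration of the statistics point above, here is a sketch using SQLite through Python's sqlite3 module. SQLite's command is ANALYZE, and it stores its results in the sqlite_stat1 table; other vendors use different commands and catalogs. The orders table and the skewed year values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, year INTEGER)")
# Deliberately skewed data: many rows for 2022, very few for 2023.
conn.executemany(
    "INSERT INTO orders (year) VALUES (?)",
    [(2022,)] * 1000 + [(2023,)] * 10,
)
conn.execute("CREATE INDEX idx_orders_year ON orders (year)")

# Collect statistics; re-run this after large data changes to keep them fresh.
conn.execute("ANALYZE")

stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'idx_orders_year'"
).fetchall()
print(stats)
```

The stat column summarises how many rows the index holds and how many rows share a typical value, which is the kind of input the planner uses when estimating cost.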
these videos are amazing!!!! thanks!!!
Awesome as usual! Thanks a lot!
Understanding how the DB engine works with indexes is key. You may assume that a WHERE purchase_date >= 2022 AND purchase > 100 would be the same if you have indexes on purchase_date and purchase, but it might be required to have a composite index... Order in the WHERE clause may also be important as it helps reduce the dataset before applying the second condition.
WHERE order has no effect on most sql systems. The only way you can force SQL to filter data first is to use a derived query.
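A quick way to check both claims on your own engine is to compare plans. The sketch below uses SQLite via Python's sqlite3 module, with a hypothetical orders table and a composite index named idx_orders_cust_date. Note that it swaps in an equality-plus-range filter (customer_id = ..., order_date >= ...) rather than the two range conditions from the comment, because that is the case where a composite index helps most.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, amount) VALUES (?, ?, ?)",
    [(i % 50, f"2023-{(i % 12) + 1:02d}-01", float(i)) for i in range(1000)],
)
# Composite index: equality column first, range column second.
conn.execute("CREATE INDEX idx_orders_cust_date ON orders (customer_id, order_date)")


def plan(sql):
    # Keep only the human-readable detail column of each plan row.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Same predicates, written in opposite order.
plan_a = plan(
    "SELECT amount FROM orders WHERE customer_id = 7 AND order_date >= '2023-06-01'"
)
plan_b = plan(
    "SELECT amount FROM orders WHERE order_date >= '2023-06-01' AND customer_id = 7"
)

print(plan_a)
print(plan_b)
```

In this setup both spellings produce the same plan, which is consistent with the reply that the textual order of WHERE conditions does not steer the planner. Other engines should be checked with their own EXPLAIN output rather than assumed.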
Thank you, this was really helpful.
good things to practice for the interview. Thanks
Superb video! Simple explanation on query optimisation.
Thanks. Good to know! Useful!
Awesome visualization, I've been loving all the short videos on this channel!
Clarifying Q. The execution order has SELECT happening after HAVING, so this should mean that the calculated column total_spent doesn't exist at the time the HAVING clause is evaluated?
As usual, excellent and to the point video!
I heard it called "predicate pushdown" when you move a condition earlier in the plan
Very simple and to the point, love the visualization too
Well explained. However, I do miss 1) the generation of multiple query plans and the selection among them (cost estimation) and, as an element of that, 2) the different table access tactics (sequential scan, index access, or index-only).
Best explanation ever
Thank you so much!
So good explanations
Fantastic explanation.
thanks a lot for your content
oh my goodness, this is too good for non IT background jumping ship to see where AI will land. Thx. You are my 3blue1brown for IT
This stuff is gold. Thank you for making this available for free. Really appreciate it!
1:26
Very good video. It is really helpful.
Thanks for this! Will there be a transcription soon?
thanks so much!
Great. Thanks for sharing..
Your presentation is so pleasant to watch, is it manually key-framed in the video editor or are there tools to do that naturally?
Lord Buddha. I'm looking for an active data flow visualization that can shorten data query response times! A great video, it saved me today. Leaving with 1 subscription as a fan! 🔍⚡
You should select from the orders table then join the customers since your where clause is a column in orders table! Your SQL is joining on unnecessary rows from orders & customers!
You guys are awesome!
Index usage tip: When using params in your query (e.g., select .... where year > ?), databases may not utilize an index if it is unbalanced. For instance, if you have approximately 1 million rows with year = 2022 and only 1000 rows with year = 2023, the database cannot predict whether the parameter will be useful for filtering. To resolve this issue, pass the value directly in the query itself, allowing the execution plan to determine if the index is suitable for the intended purpose.
As I wrote in my comment, a good understanding of how your DB engine works is key. And they are all different. So never assume that a good query on MySQL will be a good query on Postgres, Oracle, or any other SQL engine.
this opens the gate for SQL injection, don't do this
@maf_aka I think the idea was not to use prepared statements *where you don't need them.* E.g. if you already have validation in place that ensures your received value is enum (number, null, etc.) - you can be sure no SQL injection is possible there - so no need to use prepared statements *there.*
Ok, but then you get a different query plan for each (different parameter / set of parameters) query
@lethern2 yep. but that's why you need to understand how your db engine works
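To make the trade-off in this thread concrete, here is a sketch of both styles using Python's sqlite3 module. The table and the function names count_bound and count_inlined are hypothetical. The inlined version is only reasonable because the value is forced through int() first; anything that is not a plain integer is rejected before it reaches the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, year INTEGER)")
conn.executemany(
    "INSERT INTO orders (year) VALUES (?)",
    [(2022,)] * 1000 + [(2023,)] * 10,
)


def count_bound(year):
    # Bound parameter: safe by construction, but the planner may not see the value.
    return conn.execute(
        "SELECT COUNT(*) FROM orders WHERE year > ?", (year,)
    ).fetchone()[0]


def count_inlined(year):
    # Literal in the SQL text: validate strictly first. int() raises ValueError
    # for input such as "2022 OR 1=1", so it never becomes part of the query.
    year = int(year)
    return conn.execute(
        f"SELECT COUNT(*) FROM orders WHERE year > {year}"
    ).fetchone()[0]


print(count_bound(2022), count_inlined("2022"))
```

Whether inlining actually changes the plan depends on the engine, and each distinct literal may be planned and cached separately, so it is worth confirming with EXPLAIN before giving up bound parameters.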
What program is this used in the presentation?
Thank you very much!
Something doesn't add well here. If you notice HAVING clause refers to 'total_spent' which is defined in SELECT, so dependency wise HAVING should be after SELECT and not before it.
I always thought that the SELECT happened before HAVING, considering that we can use SELECT aliases in the HAVING filter.
Good explanation
Amazing!
Hi, the actual plan should be derived from EXPLAIN and EXPLAIN ANALYZE rather than from the query itself, right?
Nice bird's-eye view introduction.
It is not clear how to 'use appropriate indexes' to optimize for sorting, and how to implement pagination. Especially in your example where the sort order is made on an aggregate.
order_date is mentioned as indexed - is that implicit or explicitly defined?
thank you for your video,
I have been working in IT for 10 years, but I never knew the order between JOIN and WHERE
until I watched this video
I have always thought that the Sql structure is poorly designed by not starting from FROM and placing the reference at the end of the statement, for example in a SELECT it should go just before ORDER BY, in an UPDATE the SET after WHERE, etc. Somehow they wanted to remedy the problem by introducing the WITH clause but I'm sure many regret that whoever designed the language should have worked a little harder at the time.
Should building a CTE and then running a non-sargable query on it also be avoided?
Question: at the end of the video you mentioned do not sort the whole data and use pagination for optimizing ORDER BY and LIMIT. Those are the things I use for pagination! What do you mean by that?
The other thing is from your video LIMIT happens after ORDER BY. How come it can help when ORDER BY has already happened?!
Btw great videos and content, thank you for these
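One way to read the advice questioned above: LIMIT is logically applied after ORDER BY, but when an index already stores rows in the requested order, the engine can walk the index and stop after the first few rows instead of sorting everything. The sketch below shows this in SQLite via Python's sqlite3 module; the orders table and the index name idx_orders_date are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
    [(f"2023-{(i % 12) + 1:02d}-{(i % 28) + 1:02d}", float(i)) for i in range(500)],
)


def plan(sql):
    # Keep only the human-readable detail column of each plan row.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT id, order_date FROM orders ORDER BY order_date DESC LIMIT 10"

plan_without_index = plan(query)  # needs a temporary sort structure
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")
plan_with_index = plan(query)     # reads the index in order, no separate sort

print(plan_without_index)
print(plan_with_index)
```

When the sort key is an aggregate such as a per-customer total, no index on the base table holds that order, so the aggregation and sort still have to run before LIMIT can cut the result.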
This is pretty cool.
Can you make a video explaining the difference between system design and software architecture?
What tool do you use to generate your animations?
Very good video
This query actually does not need to join customers table since all the fields are present in the orders table already. (unless there are invalid / dirty customer_id data in the orders table and you want to filter them out)
I feel like this is a bit misleading because sometimes where and select influence the first stage. As you said, when there’s a covering index, the database won’t read the entire table. So the select and where influence what is read from the source.
Order and limit can also come in at the source as well if the index can be used with the order. You refer to this when you talk about “sorting the whole table”.
CTEs and sub queries are not mentioned but that’s okay i guess.
so in the above example, which place we should index ?
my app hasn't reached 40 queries per second yet but i will implement that just in case my app will be the next amazon :D
00:45 Understanding SQL query execution and optimization techniques
01:30 Understanding SQL execution plans can optimize queries for better performance
02:15 Optimizing SQL queries through index usage
03:00 Writing sargable queries is essential for optimizing database performance.
03:45 Sargable queries improve query performance.
04:30 Understanding the SQL execution order is crucial for query optimization
05:15 Optimizing SQL Queries with Indexes
05:57 Understanding SQL execution order is key
Crafted by Merlin AI.
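For readers who want to see what "sargable" means in practice, here is a sketch in SQLite via Python's sqlite3 module. The orders table and the index idx_orders_date are hypothetical. Wrapping the indexed column in a function hides it from the index, while an equivalent range condition on the bare column lets the engine seek into it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
    [
        (f"{2022 if i < 100 else 2023}-{(i % 12) + 1:02d}-15", float(i))
        for i in range(200)
    ],
)
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")


def plan(sql):
    # Keep only the human-readable detail column of each plan row.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Non-sargable: the function call on order_date prevents an index seek.
non_sargable = "SELECT id, amount FROM orders WHERE strftime('%Y', order_date) = '2023'"
# Sargable: the same filter expressed as a range on the bare column.
sargable = (
    "SELECT id, amount FROM orders "
    "WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01'"
)

plan_non_sargable = plan(non_sargable)
plan_sargable = plan(sargable)
rows_non_sargable = conn.execute(non_sargable).fetchall()
rows_sargable = conn.execute(sargable).fetchall()

print(plan_non_sargable)
print(plan_sargable)
```

Both queries return the same rows; only the access path differs.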
Why are we using HAVING total_spent >= 1000, but not WHERE total_spent >= 1000? Can you please explain?
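The short answer is that WHERE filters individual rows before they are grouped, while HAVING filters whole groups after the aggregate has been computed. A small sketch with invented data, run in SQLite through Python's sqlite3 module, shows that the two give different results:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [
        (1, 600.0), (1, 600.0),  # customer 1: two mid-sized orders, total 1200
        (2, 1500.0),             # customer 2: one large order, total 1500
        (3, 200.0), (3, 300.0),  # customer 3: total 500
    ],
)

# WHERE runs before GROUP BY: it only keeps single orders of at least 1000.
where_rows = conn.execute(
    "SELECT customer_id, SUM(amount) AS total_spent FROM orders "
    "WHERE amount >= 1000 GROUP BY customer_id ORDER BY customer_id"
).fetchall()

# HAVING runs after GROUP BY: it keeps customers whose total is at least 1000.
having_rows = conn.execute(
    "SELECT customer_id, SUM(amount) AS total_spent FROM orders "
    "GROUP BY customer_id HAVING SUM(amount) >= 1000 ORDER BY customer_id"
).fetchall()

print(where_rows)
print(having_rows)
```

Customer 1 never placed a single order of 1000 or more, so the WHERE version drops them even though their total qualifies.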
Will it be even faster if we always order where first and join after?
Very good
hi, can you enable captions/subtitle for this video? thank you!
Can you/anyone please explain execution of case when and window function with group by
0:12 _[JOIN comes before WHERE]_
is there any way to make the WHERE clause execute first to narrow the rows required to make the JOIN in the first place??
this is the only reason i still do this using a nested query rather than JOIN
A CTE can be beneficial in your use case.
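A sketch of the CTE approach suggested here, written against hypothetical customers and orders tables in SQLite via Python's sqlite3 module. The CTE named recent_orders narrows the orders before they are joined. Many optimizers already push a WHERE filter below the join on their own, so the two forms may well end up with the same plan; comparing them with EXPLAIN on your own engine is the only way to know.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES
        (1, 1, '2022-12-31', 50.0),
        (2, 1, '2023-02-01', 70.0),
        (3, 2, '2023-03-01', 30.0),
        (4, 2, '2022-05-05', 10.0);
    """
)

# Filter written after the join.
join_then_filter = conn.execute(
    "SELECT c.name, SUM(o.amount) AS total_spent "
    "FROM customers c JOIN orders o ON o.customer_id = c.id "
    "WHERE o.order_date >= '2023-01-01' "
    "GROUP BY c.id, c.name ORDER BY c.name"
).fetchall()

# Filter written first, inside a CTE, then joined.
filter_then_join = conn.execute(
    "WITH recent_orders AS ("
    "  SELECT customer_id, amount FROM orders WHERE order_date >= '2023-01-01'"
    ") "
    "SELECT c.name, SUM(r.amount) AS total_spent "
    "FROM customers c JOIN recent_orders r ON r.customer_id = c.id "
    "GROUP BY c.id, c.name ORDER BY c.name"
).fetchall()

print(join_then_filter)
print(filter_then_join)
```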
Will this work with MySQL as well?
In this example the 'total_spent' alias is already in use in the HAVING clause without defining. How is that possible?
yes, I have the same question, it doesnt make sense...
Having uses total_spent from the SELECT, so how come HAVING is executed before the SELECT?
I'd say so too. This is error. First SELECT part is evaluated, then - HAVING part.
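On the alias question raised in this thread: in the logical order, HAVING is evaluated before the SELECT list, and standard SQL does not let HAVING refer to a SELECT alias. Some engines, including MySQL and SQLite, accept the alias as a convenience and resolve it to the underlying expression, which is why the query in the video works there. The portable form repeats the aggregate. The sketch below runs both spellings in SQLite through Python's sqlite3 module on invented data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 700.0), (1, 400.0), (2, 300.0), (3, 1000.0)],
)

# Alias in HAVING: accepted by SQLite (and MySQL) as an extension.
with_alias = conn.execute(
    "SELECT customer_id, SUM(amount) AS total_spent FROM orders "
    "GROUP BY customer_id HAVING total_spent >= 1000 ORDER BY customer_id"
).fetchall()

# Portable form: repeat the aggregate expression in HAVING.
with_expression = conn.execute(
    "SELECT customer_id, SUM(amount) AS total_spent FROM orders "
    "GROUP BY customer_id HAVING SUM(amount) >= 1000 ORDER BY customer_id"
).fetchall()

print(with_alias)
print(with_expression)
```

Engines such as PostgreSQL and SQL Server reject the alias form, so repeating the expression is the safer habit when a query has to run on more than one database.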
Can anyone tell me when functions like COUNT or SUM are executed? Is it after LIMIT?
I think Order by is evaluated before select as order by might change selected rows...is it correct?
Yes this vid is full of mistakes
is there a way to contact you? I have some specific questions on indexes?
Is that a typo in the first select clause, total spent should be total_spent?
yes, i think so, and I have another question, 'Having' uses total_spent from the SELECT, so how come HAVING is executed before the SELECT? Doesnt make sense...
you should have more subtitles
can someone explain to me what's mutant query plans with a real life example?
Ambiguous query
I still don't understand the difference between first point noted on here 3:19 and second point noted on 3:23. Would you mind to re-explain it ? thank you!
subtitles not available
What about mongodb ?
Why are there no subtitles? I need subtitles. Thank you very much!
Where is the translation of the CC?
Please add subtitles
👍
This episode has no subtitles..