I love this guy. Why haven't I seen this guy recently?
Quick question... the switching looked instant, although I would expect that at least the first execution of the query should still be slow, to fill the Boost cache and set up the change listeners..
Did you just skip that for the vid?
Boost fills the cache in the background. So until Boost is ready the queries don't run through it. As far as listeners, it just subscribes to the existing Vitess events so that part is pretty lightweight
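A rough way to picture the answer above is a router that keeps sending the query to the database until the cache has been filled, then switches over. This is only a minimal sketch under that assumption; `BoostedQuery`, `warm`, and `execute` are hypothetical names for illustration, not PlanetScale's API.

```python
class BoostedQuery:
    """Hypothetical router: bypass the cache until it has been filled."""

    def __init__(self, run_on_database):
        self._run_on_database = run_on_database  # callable that runs the slow query
        self._cache = None                       # None means "not ready yet"

    def warm(self):
        # In a real system this would happen in the background;
        # here it is an explicit call to keep the sketch simple.
        self._cache = self._run_on_database()

    def execute(self):
        # Until the cache is ready, queries do not run through it.
        if self._cache is None:
            return self._run_on_database(), "database"
        return self._cache, "cache"


calls = []

def slow_query():
    calls.append(1)
    return [("alice", 3), ("bob", 1)]

query = BoostedQuery(slow_query)
before = query.execute()  # served by the database
query.warm()
after = query.execute()   # served by the cache
```

The first call pays the full query cost, which is why the switch can look instant once the fill has finished.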
You guys are amazing. Love to watch your videos and learn new things.
I haven't tried it yet, but I'm curious:
- Is the cache updated synchronously or asynchronously? Will the updates be available right after we write the new values?
- Will the cache work well on write-heavy tables? Something like a data logs table that can have a thousand rows written per second.
Do you know what the only problem with this channel is? Without a doubt, it's not having videos every day. 😂 Congratulations on yet another great video. ❤
Thank you!
Probably never gonna need this, but I'm curious why an 8GB cache is more than 8x as expensive as a 1GB cache.
So, the more materialized cached views you have using the same tables, the worse the write performance will be? How much impact does it have? I'm curious whether it is an acceptable trade-off. For example, if you get an extra 2s on a write but gain 60s on a read, I would be OK with that. Could you shed some light, please?
Does Boost support ORDER BY queries? And does it support the use of NOW() in a WHERE clause?
As someone who works with bigger databases and complex queries on a daily basis, I can say that this is quite unrealistic. The queries that take a lot of time are not the queries that are run over and over again. Those repeated queries are usually optimized already. The big, long queries are the ones that are very complex and have many dynamic parts (like different WHERE clauses, changing GROUP BY conditions, etc.) and therefore cannot be cached or optimized in the way you are saying.
If something sounds too good to be true, that's probably because it's not true. If it was this easy to make every query a thousand times faster everyone would do that. Disappointing to see clickbait on this channel, but good to know.
Curious question:
Between the 2 which is faster and why ?
QUERY 1:
SELECT * FROM users;
QUERY 2:
SELECT <column> FROM users;
Selecting a single column is less data. Less data needs to be read from storage, sent over the network and in the end displayed/processed, so it'll be faster.
"*" in general means the db will need to first fetch what columns exist on that table in order to retrieve them. So it's never ideal for performance, even if you need all the data.
@@gileee Thanks
Didn't quite get the last part though "So it's never ideal for performance, even if you need all the data."
What is not ideal... "SELECT *" ??
@@adarshchacko6137 Yeah. If you do * apparently the db has to first fetch the definition of the table to figure out which columns it has. So it would be slightly more performant to write "SELECT col1, col2, ..." and list all the columns.
Understood... Thank you @gileee
Hi, that's great! Is there any introduction to the implementation of the boost acceleration function in the source code?
The most technical documents we have on the implementation would be in the linked blog post! I think in the blog post there is also a link to the academic paper it's based upon
@@PlanetScale This OSDI ’18 paper is really very detailed.
I would like to know more code details. Is the source code for Boost public? If so, which file is the entry point to the implementation in the GitHub repository?😀
What are the disadvantages?
Cost, but it’s free during beta
Is this eventually consistent, or does writing block until Boost's caches are updated?
This guy is very smart
10:32
What about the data that already exists in the database? Do you rewrite the entire data, or just move the existing data to the cache based on the query that we want to boost?
Man you are awesome, i am binge watching your stuff, kudos
Thank you!
Write-through cache
It's similar but PlanetScale Boost is more performant and less to maintain.
@@PlanetScale I shall get to the bottom of this dark magic
Why does a simple query that returns just a thousand records take almost an entire second in the first place?
It's joining in a table that's an aggregate over a few million rows
@@PlanetScale ah I see, my bad. I don't use planetscale myself, but you do have some good SQL videos, so keep up the great work, cheers!
why isn't boost standard......?
HOW isn't the question. WHY is the question. We have good enough chips to leave 1980 behind.
It's a good question! The research paper was only released recently. We're among the few people in the world that have implemented it at a production grade. It's tough to deconstruct queries and apply partial updates to caches upon write instead of rerunning entire queries, so that's my guess.
What SQL client are you using? Looks so sleek
TablePlus!
Does this have a penalty on write performance?
Sounds the same as a projection
Impressive ! Is the web interface for continuous testing opensource ?
No unfortunately not!
You'd think a cache would be affordable, $100/month minimum is wild.
Does anyone know the "Vetez" he mentions in the video?
vitess.io!
Excellent feature but too expensive.
The auto creation of triggers to update cached representations / views on the fly on data changes sure is impressive.. however.. isn't this database architecture 101, something anyone creating proper databases should know?
This feels like a feature primarily targeted towards people lacking database knowledge.
Especially the video title.
Sorry if it wasn't clear, this does not rely on triggers or views. It creates a query plan, inverts it, and incrementally updates the cache as new writes come in. It works for more than just counts, and it doesn't run the entire query again. The attached blog post goes into more detail if you're interested.
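To illustrate the general idea of incremental updates (not Boost's actual implementation), here is a minimal sketch for a single grouped count, roughly `SELECT key, COUNT(*) ... GROUP BY key`. The class and method names are hypothetical, and how write events reach the cache is assumed rather than shown. Each write adjusts only the affected entry, so the full query is never rerun.

```python
from collections import defaultdict


class IncrementalCount:
    """Hypothetical cache for: SELECT key, COUNT(*) FROM rows GROUP BY key."""

    def __init__(self, existing_keys):
        self.counts = defaultdict(int)
        # One-time backfill from data that already exists.
        for key in existing_keys:
            self.counts[key] += 1

    def on_insert(self, key):
        # A new row only touches the count for its own group.
        self.counts[key] += 1

    def on_delete(self, key):
        if key not in self.counts:
            return  # nothing cached for this group
        self.counts[key] -= 1
        if self.counts[key] == 0:
            del self.counts[key]  # drop empty groups

    def read(self, key):
        # Reads are a dictionary lookup, independent of table size.
        return self.counts.get(key, 0)


cache = IncrementalCount(["a", "a", "b"])
cache.on_insert("a")
cache.on_delete("b")
```

Reads stay cheap because the work has been moved to write time, which is also where the write-performance questions elsewhere in this thread come from.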
Error Code: 1193. Unknown system variable 'boost_cached_queries' 0.141 sec
That's nice!! It's like IBM DB2 version of MQT with MAINTAIN BY SYSTEM option, but created automatically! That's cool!!
You forgot to explain the architecture of your solution, even if it is OOB in MySQL.
I hate when people do not use table/view aliases, it is a very bad habit, it makes complex queries difficult to analyze
This guy is very strong
What is the application used for running SQL query?
He literally said it in the video?
TablePlus!
@@isakhansson917 maybe I missed 😅
How does it make it 1000x faster?
Boost deconstructs the query plans and applies incremental updates to the cache when data is updated. For more details check out the video or the linked post!
Lenseless glasses T_T
🥸
If we use PlanetScale Boost, we don't need caches like Redis, right?
For queries that are Boosted, you would no longer need a cache!
This guy is very me
light mode 🤨
@@OrangeNerd1 ¯\_(ツ)_/¯
Just found your channel, can't stop watching :D, great videos so far, keep up the good work.
For things like this I tend to add a redundant (count/calc/processed) column to the table and update it on created_at/updated_at, and it does just fine. Your method is probably faster since you do less with the db, but you use more non-db resources, so it's a trade-off I guess.
Same 🤓
Can’t be the only one that ended up feeling this was an Ad disguised as a video.
👎🏻😞
You guys have gotten my business, your product is amazing and prioritizes speed and affordability which is key for me
This guy is very handsome
your last name checks out!
5:28
Quite weird that he wears fake glasses..
Sheeeesh
Very cool!
Pretty old days my mysql 😊.
Love to see this happen. I'll really give it a shot in my next project.
Amazing!
Love planetscale coz of these improvements they're doing 👍🏻
This is what we call a View
Not necessarily. A view gets re-created/re-calculated whenever it is read/queried. With PlanetScale Boost, the result is updated when the underlying data that the query represents is written to.
Kind of! In the video I explain why it's like a view, but fundamentally quite different.
set @@boost_cached_queries = 1;