This whole playlist is a valuable treasure.
Not sure if anyone (or maybe all of you) noticed the phrase he very smoothly utters at 6:58 -- “at least that’s what she said…” … amazing man!!
Haha, didn't notice it.. such humor in the context of explaining technical stuff... awesome:)
The great part about your videos is how you build the concepts from the ground up with first principles & fundamental intuitions.
I am watching your videos after reading DDIA and I am able to appreciate them so much more.
Thanks Jordan for your efforts.
The way you synthesize the information given in DDIA in 15 Minutes for a topic is amazing:)
Love it Jordan! Thanks for the video, looks like you are confident regardless of the rejections, as you just want to perform data analysis. One thing though: I don't think the LSM tree itself is just an in-memory binary tree. It consists of both an in-memory component, the memtable (the binary tree), and an on-disk component, the SSTables.
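To make the two components concrete, here is a minimal sketch, not how any real database implements it. `TinyLSM`, `memtable_limit`, and the method names are made up for illustration; a plain dict stands in for the in-memory tree, and in-memory sorted lists stand in for the on-disk SSTable files.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}    # stand-in for the in-memory sorted tree
        self.sstables = []    # stand-in for on-disk SSTables, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, immutable run.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data first: memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            keys = [k for k, _ in table]
            i = bisect.bisect_left(keys, key)  # binary search in the sorted run
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable reaches the limit and is flushed to an SSTable
db.put("a", 3)   # newer value sits in the memtable and shadows the flushed one
print(db.get("a"), db.get("b"), db.get("z"))  # prints: 3 2 None
```

A read has to consult both components, which is why describing the LSM tree as only the in-memory tree leaves out half of it.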
The humour makes these videos even better!
To summarise:
Column storage: the data of a column is stored together on disk. Selecting a few columns for analysis is faster because each column's values sit together in one location on disk, so they can be read without touching the other columns.
Column-oriented storage also allows column compression.
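A rough sketch of that summary, using Python lists in place of disk files. The table contents and the `to_columns` helper are hypothetical, made up only to show the layout difference.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"name": "Ann", "age": 25, "city": "NYC"},
    {"name": "Bob", "age": 31, "city": "SF"},
    {"name": "Cat", "age": 40, "city": "LA"},
]


def to_columns(rows):
    """Pivot row dicts into one list per column (column-oriented layout)."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# An analytical query over one column only touches that column's list;
# "name" and "city" are never read.
ages = columns["age"]
average_age = sum(ages) / len(ages)
print(average_age)  # prints: 32.0
```

In the row layout, the same query would have to walk every record and skip over the fields it does not need.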
The more that we can fit in there - the better... that's what she said... I'm dying))))
Love this series! Thank you, Jordan. Can you please post the pdfs to the ipad notes that you write during the videos? It can help a lot in quickly refreshing the topics.
Hey Pavan! I will once my current series is complete so that I can do it all in bulk
@jordanhasnolife5163 if it's complete can you do it now
Btw, like your content & thank you for making these videos.
Some unsolicited skin advice. I heard you mention bad skin/acne in one of your earlier videos. I had something similar & would suggest cutting out all sugar, beer, wine, and simple carbs from your diet, increasing fiber intake, adding a probiotic & taking all of your multivitamins. Healing leaky gut takes a while. All the best! 😊
Thanks man! I'd love to cut alcohol, but alas I love it. The main issue for me personally, I think, is eating tons of dairy for lifting, which doesn't help.
Hey Jordan, thank you for the series, complicated concepts stick really well thanks to the silly examples!
As I understood it, both compressions are used only if there is a small number of possible values. But why would I ever use dictionary compression, which doesn't reduce the number of values stored and just makes each value smaller, if regular compression greatly reduces the number of values stored? Won't the result of regular compression always take less space?
You can't always do regular compression! It requires similar values to be next to one another on disk
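A small sketch of why adjacency matters, assuming "regular compression" here means run-length encoding. The function names and the sample columns are made up for illustration.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse each run of equal adjacent values into (value, run_length)."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup table."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return codes, encoded


sorted_col = ["red", "red", "red", "blue", "blue", "blue"]
shuffled_col = ["red", "blue", "red", "blue", "red", "blue"]

print(run_length_encode(sorted_col))    # prints: [('red', 3), ('blue', 3)]
print(len(run_length_encode(shuffled_col)))  # prints: 6 (no savings at all)
print(dictionary_encode(shuffled_col))  # prints: ({'red': 0, 'blue': 1}, [0, 1, 0, 1, 0, 1])
```

On the shuffled column, run-length encoding produces one run per value, so it stores more than the original, while dictionary encoding still shrinks every entry to a small code regardless of order.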
Additionally if you just need one column, I assume you don't have to load a bunch of extra data from disk into memory just to then filter it all out. So in addition to locality benefits, you also don't have to read as much stuff from disk. Although if latency outweighs bandwidth because you aren't reading much data, it might not matter.
Absolutely correct
Could you pls do a video on different types of indexes - clustered, multi dimensional etc?
What if there are more than 9 rows during conversion from bitmap to run-length encoding? :D How would one know if 13 denotes "13" or "1" and "3"?
Keep in mind that these are really numbers in binary, so assuming we used an int to represent our run size we just read the next 32 bits.
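To illustrate the fixed-width idea, here is a sketch that stores each run length as a 4-byte unsigned int. The alternating zeros-then-ones run format, the little-endian byte order, and the function names are assumptions made for the example, not a specific database's on-disk format.

```python
import struct


def bitmap_to_runs(bits):
    """Alternating run lengths, starting with the count of leading zeros."""
    runs = []
    current, count = 0, 0
    for bit in bits:
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current, count = bit, 1
    runs.append(count)
    return runs


def pack_runs(runs):
    """Store every run length as a fixed-width 32-bit unsigned int."""
    return b"".join(struct.pack("<I", run) for run in runs)


def unpack_runs(data):
    """Read the bytes back four at a time; no delimiter is needed."""
    return [run for (run,) in struct.iter_unpack("<I", data)]


bitmap = [0] * 13 + [1] + [0] * 3
runs = bitmap_to_runs(bitmap)
data = pack_runs(runs)
print(runs, len(data), unpack_runs(data))  # prints: [13, 1, 3] 12 [13, 1, 3]
```

Because every run occupies exactly 32 bits, a run of 13 is one 4-byte value and can never be confused with a 1 followed by a 3.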
Is the compressed data stored alongside the actual table on the hard disk? How does the client know that the compressed data 011000110 (example) denotes the real value (let's say) 1?
There would be a little bit of additional metadata saying that the column is in compressed form. It would defeat the purpose of compression if we stored the original column beside it :)
If columns are stored separately, how do I get data from a column that is not in the where clause? ex: select name where age < 23 // how does it match the name column with the age column?
Sort them in the same order, so the Nth entry of every column belongs to the same row
Along with the data, the row_id can be present, like:
Name: Jordan:1, Trump:2
Age: 25:1, 102:2
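Both suggestions from this thread can be sketched in a few lines. The data comes from the example above; the helper name `select_where` and the dict-based row-id layout are assumptions made for illustration.

```python
# Option 1: columns kept in the same row order, matched by position.
names = ["Jordan", "Trump"]
ages = [25, 102]


def select_where(select_col, filter_col, predicate):
    """Return select_col entries at positions where filter_col passes."""
    return [select_col[i] for i, value in enumerate(filter_col) if predicate(value)]


under_23 = select_where(names, ages, lambda age: age < 23)
under_30 = select_where(names, ages, lambda age: age < 30)

# Option 2: each column entry carries an explicit row id.
names_by_id = {1: "Jordan", 2: "Trump"}
ages_by_id = {1: 25, 2: 102}

matching_ids = [row_id for row_id, age in ages_by_id.items() if age < 30]
names_via_ids = [names_by_id[row_id] for row_id in matching_ids]

print(under_23, under_30, names_via_ids)  # prints: [] ['Jordan'] ['Jordan']
```

Positional matching needs no extra storage per value but requires every column to keep the same order, while explicit row ids cost extra space per entry.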
Audio is completely out of sync tho 😪
Damn, will look into this and have it fixed by next time
Did you choose 322 on purpose? Are you a Dota 2 fan?
Haha I did not, didn't know there was a reference there
nomad filmmaker we know about... but nomad coder? wow!
Haha not exactly, but that would be fun