This instructor is so cool. He makes database course fun to learn.
I love how he curses, and has a DJ! 🤣
So much more down to earth than I'm used to.
This is the ideal db lectures every school should offer but not every student deserves... only in CMU... just awesome.. Prof. Pavlo, knows db, luvs db...
Not every student deserves???
@@jaffreyjoy cuz u have to get admitted by school like CMU first.. most of their students have been working hard on the admission. Yep not everyone deserves.
I have been a DBA for years and I did not know these intimate details. Great thanks to AP. You are simply awesome.
This course is awesome! This guy is like the sensei of database systems.
1:06: In one statement from Oracle: insert into r (select 101,'aaa' from dual union select 102,'bbb' from dual union select 103,'ccc' from dual)
1:04:48 - I'm not sure I understand one thing. If the compaction caused the slot number to change for a tuple "ccc", that means that the upper parts of the system (e.g. indexes) HAVE TO get notified that the way they used to refer to that tuple (page:74608, slot:2) is not correct anymore.
I had exactly this question and scrolled down the comments to check that I got it right.
Amazing. I was reading a totally different db book and wondering why we weren't using virtual memory. This is exactly the answer I needed!
notes to self:
we have a page directory to help find exact mem location of a page
and inside each page we have a header + slot array that help locate the mem location of a tuple
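A minimal sketch of those notes, assuming a record id is just a (page_id, slot) pair. The names `SlottedPage`, `PageDirectory`, and `PAGE_SIZE` are made up for illustration, and the header and slot array are kept as plain Python fields instead of being serialized into the page bytes, so this is not how any real DBMS lays things out:

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size in bytes; real systems vary


@dataclass
class SlottedPage:
    """Toy slotted page: tuple bytes are written backwards from the end."""
    data: bytearray = field(default_factory=lambda: bytearray(PAGE_SIZE))
    slots: list = field(default_factory=list)  # slot number -> (offset, length)
    free_end: int = PAGE_SIZE  # start of the used region at the back of the page

    def insert(self, payload: bytes) -> int:
        start = self.free_end - len(payload)
        if start < 0:
            raise ValueError("page full")
        self.data[start:self.free_end] = payload
        self.free_end = start
        self.slots.append((start, len(payload)))
        return len(self.slots) - 1  # the new tuple's slot number

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])


class PageDirectory:
    """Toy page directory: maps a page id to where that page lives."""

    def __init__(self):
        self.pages = {}  # page_id -> SlottedPage (a real one maps to a file offset)

    def new_page(self, page_id: int) -> SlottedPage:
        self.pages[page_id] = SlottedPage()
        return self.pages[page_id]

    def fetch(self, record_id) -> bytes:
        page_id, slot = record_id
        return self.pages[page_id].read(slot)


directory = PageDirectory()
page = directory.new_page(7)
slot = page.insert(b"aaa")
print(directory.fetch((7, slot)))
```

So a lookup is two hops: the directory finds the page, then the slot array finds the tuple inside it.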
Invaluable content!! I have been looking for this for a long time
kudos to the video auditor that took the time to beep the sh*t out of the video
loll this class is sick. I wish my profs were this cool.
This course is so good. Andy is awesome
For a moment I thought that the slot (or offset/slot) part of the record identifier never changed and that the only thing that changed was what that slot pointed to. Like, in case tuples get re-organized/de-fragmented, slots would update their pointers.
But it seems, based on this lecture, that slots that the system exposes as part of the record identifier can change.
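Here is a rough sketch of the first mental model in this comment, where compaction moves tuple bytes but never renumbers slots. It is only an illustration of that idea, not a claim about what any particular DBMS does; the `compact` and `read` helpers and the tiny `PAGE_SIZE` are assumptions made up for the example:

```python
PAGE_SIZE = 64  # tiny assumed page size to keep the example readable


def compact(data: bytearray, slots: list) -> int:
    """Repack live tuples against the end of the page.

    slots[i] is (offset, length), or None for a deleted tuple. Slot numbers
    are never reassigned; only the offsets stored in the slots change.
    Returns the offset where the used region now starts.
    """
    new_data = bytearray(len(data))
    end = len(data)
    for i, entry in enumerate(slots):
        if entry is None:
            continue  # deleted tuple: its slot stays empty and reusable
        offset, length = entry
        end -= length
        new_data[end:end + length] = data[offset:offset + length]
        slots[i] = (end, length)
    data[:] = new_data
    return end


def read(data: bytearray, slots: list, slot: int) -> bytes:
    offset, length = slots[slot]
    return bytes(data[offset:offset + length])
```

With this design an index that stored (page, slot 2) for "ccc" keeps working after compaction, because only the offset inside slot 2 changed. If a system instead renumbers slots when it rewrites a page, anything holding the old record id has to be updated.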
He is just awesome!! Can you please upload the lectures un-beeped? (maybe by age-restricting them) The beeps are not good! Thanks a lot Andy!!
best db course
What is the book they use in this course?
You said that rowid(s) are useful because we don't have to update indexes, but when we inserted into SQL Server, and it changed ids, does it mean that it would have to update indexes as well? Wouldn't it be better to keep ids as is, even if we had to move tuples inside the page (to compact data), and fill the empty slot in the middle with a new tuple using the address after existing tuples? Or do slots have to have strictly increasing offsets for data at the page? Well, I guess, the answer is "it depends" 🙂
From within Oracle's SQL*Plus, 'set history on' gets you command recall.
Can someone explain the "WHY NOT USE THE OS?" part at 18:42? Can someone demonstrate with an example?
See this paper: db.cs.cmu.edu/mmap-cidr2022/
The best part starts at 22:08
Here lies one who hated mmap! xD
Love this song at the end of each lecture.
Thank you so much for sharing this invaluable content.
is this still relevant? i thought everything is stored in memory nowadays with spark etc
Can a tuple be larger than the page size and be split across two pages?
Does the page here have any relation to the "page" terminology in operating systems?
Thanks for sharing this great course.
TDD and CI/CD for DBs/Data is the neglected frontier.
I don't understand why we need the slotted pages. If a page is full and we do something like update all the tuples so that they are a bit bigger, won't we then have too much data to store in the page, and we'll need to deal with invalid references anyway? Or can the size of tuples not change? Or do we need to deal with invalid references in that case but we just prefer not to do that all the time for efficiency reasons, and the slots just help us do it less?
i think it might be the 3rd one. Not sure though. Have to see next lectures to figure this out.
does this mean that the maximum row size can never be greater than 16kb? I'm using postgres at work and i think some rows easily exceed 16kb (with jsonb data).
No. For Postgres, they store larger values in separate TOAST storage tables.
So Andy is also called Andrew ...
is doing "vacuum full" on prod databases from time to time a good idea? since it appears to save space
Vacuum full takes an exclusive lock on each table while it rewrites it, which blocks reads and writes on that table, so you need to be careful when you run it.
a very important question
does this course feel like an in-depth course?
i mean do software developers who aren't gonna specialize in DB administration have to know all of this stuff, like data storage in DBs?
You should have a good idea how to use databases, but if you are not going to specialise in this then maybe you don’t need to know everything.
Very well explained. Many parts became clearer after discussing them with classmates.
How does one create record id's for external tables ?
So does the OS use virtual memory for everything except the I/O when running the database server? Won't that be a bottleneck?
Can you elaborate with an example?
This is an amazing class
TBH I don’t really like the beep muting the original words; it interrupts the lecture and loses the original feel of the course. I think we should honor what the professor said, unchanged.
How can I get the homework questions, please? Thanks to the best instructor.
15445.courses.cs.cmu.edu/fall2021/assignments.html
did you get it yet ?
@@saifmohamed1776 did anyone get it?
what is the difference between files and pages?
A page is a fixed-size chunk of data inside a file. Like a cluster. It's the unit of storage the database reads and writes.
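To make that concrete, here is a small sketch that assumes the simplest possible layout: one file holding back-to-back pages of `PAGE_SIZE` bytes, so page N starts at byte N * `PAGE_SIZE`. The `read_page` helper and the 4096-byte size are assumptions for illustration, and real systems add headers and a page directory on top of this:

```python
import io

PAGE_SIZE = 4096  # assumed page size in bytes; real systems vary


def read_page(f, page_id: int) -> bytes:
    """Return the bytes of one page from a file of back-to-back pages."""
    f.seek(page_id * PAGE_SIZE)
    page = f.read(PAGE_SIZE)
    if len(page) != PAGE_SIZE:
        raise EOFError(f"page {page_id} is not fully present in the file")
    return page


# An in-memory stand-in for a database file with two pages.
fake_file = io.BytesIO(b"\x00" * PAGE_SIZE + b"\x01" * PAGE_SIZE)
second_page = read_page(fake_file, 1)
print(len(second_page), second_page[0])
```

The file is the thing the OS knows about; the page is the unit the database chooses to carve it into.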
Will teach LOG-STRUCTURED FILE ORGANIZATION in next lecture
1:00:00 Tim Kraska betrayed me.. lol
Who is Tim Kraska? And how did he betray you?
Bro is a straight g
Amazing video
This course requires me to have some SQL language basics.
Like the beginning~
great course!
50:00
Thanks so much for telling your students you hate Trump... Super helpful!!!!
you are talking way too fast, or i think i'm too slow. Excellent course!
58:40
This was so cool
23:58
Did he just say shit in a lecture and bleep it out?
I hope the part about not talking to his family because of a voting choice was a joke :(. I know politics are important in America, but your family should be even more important.
why are you bleeping the shit out of andrew?
course dj...🤣
it would be better to just leave the vid as what it was, right?
he thought it was okay to use profanity during live lecture (school seems okay with him) and then dozens of youtubers use profanity here as well ....
personally i think it would be better to not mute it.. the beeeeeeeeep sound really hits my eardrum & i'm getting annoyed by that
fuck he talks so fast