F2023 #05 - Storage Models & Database Compression (CMU Intro to Database Systems)
- Published on Nov 2, 2024
- Andy Pavlo (www.cs.cmu.edu...)
Slides: 15445.courses....
Notes: 15445.courses....
15-445/645 Intro to Database Systems (Fall 2023)
Carnegie Mellon University
15445.courses....
Some timestamps
12:12 - Storage Models
13:27 - N-ary Storage Model
20:00 - Decomposition Storage Model
32:59 - DSM: Tuple Identification
34:44 - DSM: Variable-length data
36:38 - DSM: History
41:10 - PAX (Partition attributes across) model
45:50 - Compression
Great sessions. Hooked on them.
LOVE THE NEW INTRO.
39:39 Why is Subway in there 😂? Did Subway build a columnar database for their needs? I googled it but couldn't find anything related to it.
PS: oh someone asked the same question at 40:56
Great intro & outro!
thanks a lot. really appreciated.
39:50 re: Parquet's provenance, I don't know about Dremio being involved, but I think Cloudera and Twitter were heavily involved and then contributed it to the Apache Software Foundation.
Slide 44: It should be 8 * 2 bits = 16 bits, right? We have 8 tuples, not 9. And the 2 * 8 bits = 16 bits is for mapping the codes to concrete values ("Y"/"N" in this case), right?
Yes, you are correct. It was a typo. Fixed! Thank you.
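The arithmetic in the slide 44 exchange above can be checked with a small sketch. It uses only the numbers quoted in the comment (8 tuples, 2-bit codes, two 8-bit mapped values) and additionally assumes the uncompressed values are 8 bits each; the function name is hypothetical and nothing here is taken from the slides themselves.

```python
def dictionary_encoded_bits(num_tuples, code_bits, dict_entries, value_bits):
    """Return (bits for the per-tuple codes, bits for the code-to-value mapping)."""
    codes = num_tuples * code_bits          # one fixed-width code per tuple
    mapping = dict_entries * value_bits     # one stored value per distinct entry
    return codes, mapping

# Numbers from the comment: 8 tuples, 2-bit codes, 2 values ("Y"/"N") of 8 bits each.
codes, mapping = dictionary_encoded_bits(8, 2, 2, 8)
original = 8 * 8  # assumption: each uncompressed value takes 8 bits

print(codes, mapping, codes + mapping, original)  # prints: 16 16 32 64
```

Under these assumptions the column shrinks from 64 bits to 32 bits, which matches the 8 * 2 bits + 2 * 8 bits breakdown in the comment.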
Loved it
Slide 42. Compressed value 2 should be 191, no? Loving the series btw!! Thank you!! 🎉❤
Slide 22: can we use a hybrid? Have one index that maps the virtual offset (virt. page_id, virt. slot_num) of a row in the table to the list of physical (page_id, slot_num) locations its attributes belong to, and a second index that maps a physical (page_id, slot_num) back to the virtual offset. This way a point query only has to traverse two indexes to get all physical pages and slots for a row at once. At the same time, we don't need to store embedded ids alongside the attributes in the same page.
The virtual offset is probably like the rowid in SQLite (0, 1, 2, ...).
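The two-index idea from the slide 22 comment can be sketched in a few lines. This is only an illustration of the commenter's proposal, not something shown in the lecture: the class and method names are hypothetical, plain in-memory dictionaries stand in for real index structures, and the virtual offset is simplified to an integer row id.

```python
from collections import defaultdict

class HybridTupleIndex:
    """Hypothetical sketch: forward and reverse maps between a virtual row id
    and the physical (page_id, slot_num) of each of its attributes."""

    def __init__(self):
        # row_id -> {attribute name: (page_id, slot_num)}
        self.virt_to_phys = defaultdict(dict)
        # (page_id, slot_num) -> row_id
        self.phys_to_virt = {}

    def add(self, row_id, attr, page_id, slot_num):
        self.virt_to_phys[row_id][attr] = (page_id, slot_num)
        self.phys_to_virt[(page_id, slot_num)] = row_id

    def locate(self, row_id):
        """All physical locations for one row, found with a single lookup."""
        return dict(self.virt_to_phys.get(row_id, {}))

    def row_of(self, page_id, slot_num):
        """Reverse lookup: which row owns this physical slot (None if unknown)."""
        return self.phys_to_virt.get((page_id, slot_num))

idx = HybridTupleIndex()
idx.add(0, "id", page_id=10, slot_num=0)
idx.add(0, "name", page_id=20, slot_num=3)
print(idx.locate(0))      # {'id': (10, 0), 'name': (20, 3)}
print(idx.row_of(20, 3))  # 0
```

The trade-off is that both maps must be updated on every insert, delete, or move, so the scheme swaps embedded ids for index maintenance.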
That one dude laughing at isDead example lmao.