Hi Mr. Pavlo. I did not get your point at 23:28. If we have primary key lookups, MySQL way is better since we do one lookup and we find the whole tuple there. Whereas if we have secondary indexes in MySQL that point to primary index (since that's where the whole tuple is stored AFAIU), this amounts to two lookups. For the Postgres approach, we first lookup the key in the index and find the record ID to fetch the page (this applies to both primary and secondary indexes AFAIU). So this amounts to also two lookups till we get to the tuple. So the question is, if both ultimately need two lookups for secondary indexes, why is MySQL the worse choice in case of secondary indexes? Thanks a lot in advance 🙂
I am currently studying this topic, so anyone can correct me if I am wrong. Prof. Pavlo mentioned that leaf node for secondary index in MySQL stores primary key. (22:01) Thus, assuming you found the leaf node for the secondary key, - MySQL: probe the primary key index to search for the record - Postgres: do a lookup on the page that the record is stored So, doing a simple lookup (PostgreSQL) is faster than probing the primary key index (MySQL)
You shouldn't treat an index query process as one lookup, cause it has two steps: first traverse the tree to get the leaf page location, and second fetch leaf page. The cost of the first step is O(height of the tree) , are same in both of them. but the cost of the second step vary between MySQL and Postgress. When doing Primary Search: MySQL is better than Postgress, but when doing Secondary Search: Postgress is better
Was surprised to hear it too, 'cause, the article that details their reason for switching to MySQL was too detailed, (and covered so many other use cases of theirs asides fetching data) that it shocked me to hear that they reverted. And Andy himself had said (in the 2023 classes), that Postgres only has a good API but with BAD implementation. I ALMOST concluded that Postgres was bad for any use case after reading that (Uber) blog post.
Hi Mr. Pavlo. I did not get your point at 23:28. If we have primary key lookups, MySQL way is better since we do one lookup and we find the whole tuple there. Whereas if we have secondary indexes in MySQL that point to primary index (since that's where the whole tuple is stored AFAIU), this amounts to two lookups.
For the Postgres approach, we first lookup the key in the index and find the record ID to fetch the page (this applies to both primary and secondary indexes AFAIU). So this amounts to also two lookups till we get to the tuple.
So the question is, if both ultimately need two lookups for secondary indexes, why is MySQL the worse choice in case of secondary indexes?
Thanks a lot in advance 🙂
I am currently studying this topic, so anyone can correct me if I am wrong.
Prof. Pavlo mentioned that leaf node for secondary index in MySQL stores primary key. (22:01)
Thus, assuming you found the leaf node for the secondary key,
- MySQL: probe the primary key index to search for the record
- Postgres: do a lookup on the page that the record is stored
So, doing a simple lookup (PostgreSQL) is faster than probing the primary key index (MySQL)
You shouldn't treat an index query process as one lookup, cause it has two steps: first traverse the tree to get the leaf page location, and second fetch leaf page. The cost of the first step is O(height of the tree) , are same in both of them. but the cost of the second step vary between MySQL and Postgress. When doing Primary Search: MySQL is better than Postgress, but when doing Secondary Search: Postgress is better
IIUC, it should be "Secondary indexes must store the primary key as their values." based on the video, right?
What’s the Uber blog post of them switching BACK to Postgres? (@23:40)
www.uber.com/blog/postgres-to-mysql-migration/
I searched for it and didn't find any clue
Was surprised to hear it too, 'cause, the article that details their reason for switching to MySQL was too detailed, (and covered so many other use cases of theirs asides fetching data) that it shocked me to hear that they reverted.
And Andy himself had said (in the 2023 classes), that Postgres only has a good API but with BAD implementation.
I ALMOST concluded that Postgres was bad for any use case after reading that (Uber) blog post.
Hi all, why does the size of a node equal to a size of a page