The Basics of Database Sharding and Partitioning in System Design
- Published on Jun 4, 2024
- Make sure you're interview-ready with Exponent's system design interview prep course: bit.ly/3YTjsjH
Learn the basics of database sharding and partitioning in system design with this video! Database sharding and partitioning are both about breaking up a large data set into smaller subsets. Sharding implies that the data is spread across multiple databases, while partitioning usually implies dividing a table into multiple pieces based on some separation criteria. We'll explain the differences between these two concepts and when to use each one.
Chapters -
00:00 - Intro
01:18 - Sharding techniques
02:43 - Manual vs Automatic sharding
04:27 - Advantages of sharding
05:13 - Disadvantages of sharding
Watch more system design videos here:
- Meta engineering manager answers a rate limiter interview question: • System Design Mock Int...
- Google SWE answers an algorithms interview question: • Google Software Engine...
- Google TPM answers Tiktok system design interview question: • System Design Mock Int...
- Flipkart EM “Design Amazon Prime Video” system design interview question: • System Design Intervie...
👉 Subscribe to our channel: bit.ly/exponentyt
🕊️ Follow us on Twitter: bit.ly/exptweet
💙 Like us on Facebook for special discounts: bit.ly/exponentfb
📷 Check us out on Instagram: bit.ly/exponentig
📹 Watch us on TikTok: bit.ly/exponenttikttok
ABOUT US:
Did you enjoy this interview question and answer? Want to land your dream career? Exponent is an online community, course, and coaching platform to help you ace your upcoming interview. Exponent has helped people land their dream careers at companies like Google, Microsoft, Amazon, and high-growth startups. Exponent is currently licensed by Stanford, Yale, UW, and others.
Our courses include interview lessons, questions, and complete answers with video walkthroughs. Access hours of real interview videos, where we analyze what went right or wrong, and our 1000+ community of expert coaches and industry professionals, to help you get your dream job and more!
Animations to visualize what she is saying would make this video perfect!
I didn't know what database sharding was. This video gave me a good number of topics to research and learn. Thanks for the video!
This was exactly the information that I needed. Thank you!
You guys are amazing. I recently found your channel; I am learning a lot and loving it.
very well described, thanks for sharing.
Greatly explained, I subbed
Great and to the point explanation, No bluff
Thanks
Glad you liked it!
Great video!
Some people are very beautiful with a helping hand , thanku❤
Crystal clear
Awesome, thanks
Awesome explanation.
Thanks!
Great video on sharding, but partitioning wasn't mentioned or discussed.
A few things to add. I prefer partitioning on a key that is guaranteed not to distribute badly, so "first letter of name" is a bad idea. Better to use the record id and group 100k or so of them into a partition. Then, before storing partitions on different servers, there are a few more things to do first. One is to split modifying queries from read-only queries (which has to be done at the application level) so that a simple read-replica server (which is trivial to set up in Postgres) can be used. The next option is a DB split at the logical level: for example, keep the users' core data on db1 and chat messages on db2. Leaving out foreign keys and using weak references instead, with a periodic cleanup job that resolves broken links, is a good idea; it also eliminates issues when restoring backups that were cut at a bad moment.
Coming from a decade+ of data work with health records, I have to bump this comment. Name, location and birthdate combined still aren't unique. Messing up data with potential tromps like this is straight up lethal in some fields.
Remember, friends: bad data is worse than no data.
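The id-range grouping suggested in the comment above (roughly 100k records per partition) could be sketched like this. It is a minimal illustration only: it assumes integer, non-negative record ids, and the names `PARTITION_SIZE` and `partition_for` are hypothetical, not taken from the video or from any database.

```python
# Hypothetical constant: records per partition, using the ~100k figure from the comment.
PARTITION_SIZE = 100_000


def partition_for(record_id: int, partition_size: int = PARTITION_SIZE) -> int:
    """Return the zero-based partition index for a non-negative integer record id."""
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    # Integer division puts ids 0..99_999 in partition 0, 100_000..199_999 in 1, and so on.
    return record_id // partition_size


if __name__ == "__main__":
    for rid in (0, 99_999, 100_000, 250_000):
        print(rid, "->", partition_for(rid))
```

Because ids are typically assigned sequentially, each partition fills to the same size, which is the property the comment is after; a key such as the first letter of a name gives no such guarantee.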
I would think that another potential disadvantage arises if you are using commercial rather than open-source operating systems or databases, where licensing costs also increase as the number of servers increases.
Good video but confusing use of the term 'partition', which is different than 'shard'.
Who is she and how do we get more videos with her?
Monolithic Databases??
Until her hands moved I thought she was an AI robot 😂
The video script explains the basics of database sharding and partitioning in system design. It discusses how sharding can help manage large amounts of data by breaking it up into smaller partitions spread across multiple servers. The script also highlights the advantages and disadvantages of sharding in terms of scalability, performance, and operational complexity.
Key moments:
00:32 Traditional databases encounter limitations with increasing data size, necessitating sharding to enhance scalability and performance.
-Geo-based sharding partitions data based on user locations, reducing latency by routing users to the closest node.
-Range-based sharding divides data by key value ranges, simplifying partition computation but potentially leading to uneven splits.
-Hash-based sharding uses hashing algorithms to evenly distribute data across partitions, reducing hotspots but potentially separating related rows.
-Automatic sharding dynamically manages data partitioning for higher performance and scalability, but manual sharding at the application layer increases development complexity.
03:55 Sharding enables scaling, faster queries, and system availability, but poses challenges like complex management, hot spots, and high operational costs.
-Advantages of sharding include scalability, faster queries, and improved system availability during outages.
-Disadvantages of sharding involve complex data relationships, potential hot spots, and operational costs for maintaining high availability.
Generated by sider.ai
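The range-based and hash-based techniques summarized above can be sketched as two small routing functions. This is only an illustration under stated assumptions: the shard names, the range boundaries, and the function names `range_shard` and `hash_shard` are hypothetical, and real systems usually delegate this routing to the database or a proxy layer.

```python
import bisect
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

# Hypothetical exclusive upper bounds for the first three shards;
# the last shard takes every key at or above the final bound.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def range_shard(user_id: int) -> str:
    """Range-based: pick the shard whose key range contains user_id."""
    return SHARDS[bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)]


def hash_shard(key: str) -> str:
    """Hash-based: hash the key and take it modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


if __name__ == "__main__":
    print(range_shard(1_500_000))
    print(hash_shard("user:42"))
```

The range version is easy to compute and keeps neighbouring keys together, but shards fill unevenly if keys cluster. The hash version spreads keys more evenly at the cost of scattering related rows, matching the trade-off described in the summary.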
It sounds like you mixed up partitioning with sharding.
And commodity hardware does not have ECC - don’t run a db on it.
Each partition is stored within the same database server, so is partitioning easier because sharding requires multiple database servers?
Sorry, everyone...
I parted *_and_* sharded 😢
Some visualization would have gone a long way
Thanks for the feedback!
You are looking so cute 🥰
Well thanks for reading the script.
😂😂😂
A lot of these YT educators write down the material before speaking to the camera. What’s your point?
am in love with this lady what her id
you got the definition of Sharding wrong. understood you never did sharding in your life.
Reading from a teleprompter is not teaching!! Sure, it gave me topics that I can look up myself.
A lot of youtube educators have their material scripted before speaking to the camera? What’s your point?