This channel is pure gold. I've decided to watch every video you posted.
Thanks man!
@@jordanhasnolife5163 +1
Doing exactly that now and loving each video. These videos are addictive and set a high standard. Hope Jordan knows that by now.
Watched 4 videos in a day. Showing more commitment to this channel than most people do to their wives.
Only thing I commit to is getting this bread
Great video. Small correction at 10:20. You said leaderless replication improves write throughput.
Actually:
- Leaderless replication trades write performance and consistency for improved availability.
- Multileader replication trades consistency for increased write performance and availability.
- Single leader replication trades write performance and availability for improved consistency.
This is because in leaderless replication, every node in a quorum is read from and written to. This is like single leader replication in that all servers are being hit for each request, so it doesn't increase write throughput by letting different reads/writes hit different servers. Oftentimes, deployments will mix and match leaderless replication with multileader replication. In this case, you have different quorums for different sections of your application. You hinted at this in sloppy quorum (profile quorum is different from message quorum).
Yeah agreed with everything said, nice catch
Well, keep up the series, really enjoying the videos, commenting for the algorithm to push it to more users.
Thanks so much!
All your videos are great. So much content in just a short time. And man, your jokes at the start of the videos really crack me up.
I am binge watching your playlist because I have nothing else to do tonight. How awesome am I?
That's pretty metal, I'm all for it
Concise and exhaustive coverage of all the topics, thanks
AH thank you for this! I've been reading Designing Data Intensive Apps lately and it helps so much to have a review of the chapters. I wonder, do you now (2 years after posting) still think that it's a great book (in my opinion the best for detailed System Design concepts), or did you maybe find something even better?
Your videos rock!
Yep! Totally still think it's the best thing I've read :)
Great video, great examples, great enthusiasm. Really though this helped a lot, thanks.
So as explained (6:06) for point no. two: if you restore a failed node from an older node, then that node no longer has an up-to-date key and value, right? But if the write failed anyway, then the up-to-date value is the value in the older nodes, right? The new value should never have been written, since it was a failed write!
Now how does this mess up the w+r>n equation?
1) you have 3 nodes
2) you successfully write to 2 of them
3) one of the two that was written to goes down
4) we bring up a new node
5) we restore the new node from the one without the write
6) we successfully read from those two nodes and don't see the new write
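The six steps above can be sketched as a tiny in-memory simulation. This is only an illustration under assumptions: the node names, the `(value, version)` tuples, and the function name are made up for the example and are not from the video.

```python
# Hypothetical model of the six steps above; all names are illustrative.
N, W, R = 3, 2, 2  # W + R > N, so a read quorum should overlap a write quorum


def simulate_restore_from_stale():
    # 1) three nodes, all holding the old value at version 1
    nodes = {"a": ("old", 1), "b": ("old", 1), "c": ("old", 1)}

    # 2) the write succeeds on W = 2 nodes (a and b)
    for name in ("a", "b"):
        nodes[name] = ("new", 2)

    # 3) one of the nodes that took the write goes down
    del nodes["b"]

    # 4-5) a new node comes up and is restored from c, which never saw the write
    nodes["d"] = nodes["c"]

    # 6) a read hits R = 2 nodes (c and d) and keeps the highest version seen
    replies = [nodes["c"], nodes["d"]]
    return max(replies, key=lambda reply: reply[1])


result = simulate_restore_from_stale()
print(result)
```

The write was acknowledged by a quorum, yet the read quorum of `c` and `d` contains no copy of it, because only `a` still holds the new value.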
Thanks! Good and concise content. It will help me on my Cloud Computing test!
YouTube, why don't you recommend this content ... committed to watching your awesome content bro :)
Love the videos, they are great
Thanks!
I usually hate everyone from tech, but you are so cool that I had to subscribe and like the videos, which is something that I never do
Oh baby, welcome Julio!
Hi Jordan, awesome content, thanks for putting that together!!
I generally see quorums being discussed for leaderless replication. Is this concept not also valid for single leader replication, as a way to achieve tunable consistency?
While I imagine that is possible, it would require waiting for responses for a quorum of nodes, whereas single leader replication is typically completely asynchronous
Beautiful!
First time I watched this I was just looking for help while reading "Designing Data-Intensive Applications", and this video really helped to explain stuff, especially sloppy quorums. Now I'm going through your whole playlist and have questions about putting it all together.
Single leader with plenty of replicas is the move for high read, low write databases, right?
When you have higher writes there are many more specifics you can discuss, like whether only certain sections of data are being written or only certain users etc., so there are application specific factors to take into account beyond read/write ratio.
With that said, on determining multileader vs leaderless:
Both can speed up writes at the cost of reads, both also have the issue of write conflicts. Multileader may be more suited for a smaller number of servers, or an app that tolerates eventual consistency, is that right? Leaderless may be better for an app that has more servers and less predictable behavior? We can also change the throughputs to speed up writes and slow down reads or vice versa by modifying the quorum (N=99 W=10 R=90, or N=99 W=50 R=50, ...) assuming we have a migratory period where all servers are on the same page before changing those values.
I guess could you elaborate on when to use multileader vs leaderless?
Yeah to me this sounds mostly right. It basically just depends on how you want to deal with write and read speeds. In a leaderless setup you need to write to many nodes and read from many nodes. That being said, you have far stronger consistency guarantees!
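The quorum settings mentioned in this thread (N=99 W=10 R=90, N=99 W=50 R=50) can be checked against the w+r>n condition with a few lines of arithmetic. This is a minimal sketch; the function names are hypothetical, and the third configuration is added only as a counterexample.

```python
def quorums_overlap(n, w, r):
    # A read quorum is guaranteed to share a node with a write quorum
    # only when w + r > n.
    return w + r > n


def min_overlap(n, w, r):
    # Smallest number of nodes any read quorum shares with any write quorum.
    return max(0, w + r - n)


# First two configurations are from the comment; the last one is a
# made-up counterexample that does not satisfy w + r > n.
configs = [(99, 10, 90), (99, 50, 50), (99, 10, 50)]
report = {c: (quorums_overlap(*c), min_overlap(*c)) for c in configs}

for (n, w, r), (ok, shared) in report.items():
    print(f"N={n} W={w} R={r}: overlap guaranteed={ok}, shared nodes>={shared}")
```

Both settings from the comment satisfy the condition with exactly one guaranteed shared node; they differ only in whether the waiting cost falls on writes or on reads.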
Awesome content. Thanks.
thanks for doing this !
Great explanation
When to use a particular replication strategy?
Single Leader:
Conflict free, low write throughput
Multi Leader:
Has conflicts, but conflict free within a datacenter, good for cross data center coordination
Leaderless:
Has conflicts, best for maximizing write throughput, can use quorum reads and writes to try and achieve strong consistency or ditch those and just use read repair and anti entropy for maximum speed and eventual consistency
@@jordanhasnolife5163 What's the use of leaderless over Multi Leader? Both have high write throughput, both suffer from write conflicts. But reads will be slower in case of Leaderless due to read repair.
@@darth_vader4824 You don't have to do read repair - you could do anti-entropy which is enough. Read repair is just nice if you want to get data more consistent around the replicas more quickly.
As a result, since multi leader configurations typically do not use anti entropy, they take a while to sync up, hence why people tend to mainly use it for cross data center operations (I, living in the US, don't care about up to date Chinese writes for the most part).
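For reference, read repair as discussed above can be sketched roughly like this. It is a simplified illustration, assuming each replica is a plain dict and each value carries a single integer version; real systems typically use timestamps or version vectors, and the function name is hypothetical.

```python
def read_with_repair(replicas, key):
    # Ask every contacted replica for its (value, version); missing keys
    # are treated as version 0.
    replies = [(rep, rep.get(key, (None, 0))) for rep in replicas]

    # Pick the reply with the highest version as the answer.
    value, version = max((reply for _, reply in replies), key=lambda r: r[1])

    # Repair step: write the newest value back to any replica that was stale.
    for rep, (_, seen_version) in replies:
        if seen_version < version:
            rep[key] = (value, version)
    return value


# Three replicas: one up to date, one stale, one missing the key entirely.
replicas = [{"k": ("v2", 2)}, {"k": ("v1", 1)}, {}]
value = read_with_repair(replicas, "k")
print(value, replicas)
```

The repair only happens for keys that are actually read, which is why a background anti-entropy process is still needed for data that is rarely requested.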
Meaningful content, thanks
I bought vaseline and kleenex because of the last video. Will this help me pass the interviews?
As long as you make sure to use them during the interview itself you should be good to go
🐐
great (y)
Looks like I found Goldmine here.
Oh trust me I love digging for gold ;)
It's good that u don't have a life, else we would have missed such good content😬.. anyway why is ur channel not famous yet🤔
Idk man help me spread it haha - otherwise I'll have to make a day in the life video
@@jordanhasnolife5163 😂😂
came here for system design, learning how to show commitment to wives....of other people.
one more comment
I'll take it