Thanks for making all these insightful videos! One question -> in this new series of videos, you mentioned that linearizability is mainly about reads never go back in time. For example, x = 10 is written to the replica1, x = 5 (old value) in replica2. Before new value 10 is populated to replica2, user read 5 from replica2 and that's okay. The only issue is that if they read 5 from replica2, and then 10 from replica1 and again 5 from replica2 (go back). (i.e okay to read not the most recent value, but once we read newer values we can never read old ones again). I also checked out the old series video for this concept. There it mentioned that no stale reads, which my understanding is that user should never read 5 after 10 is written to any of the replicas which is linearizability. I am a bit confused now, curious which one is more accurate?
I'd say the first definition is more accurate. The second one would moreso define "strong consistency", but it's very easy to build strong consistency from linearizable storage. In linearizable storage, each write has a sequence number, so we can easily tell when our write is going back in time and disregard it.
Hi Jordan! Love your videos. I look forward to them. Just one question. At the end of your video around 12:10 you said, "if you're holding a lock... and then all of a sudden the system thinks the previous owner of the lock is the one holding it, you're in trouble." What does this mean exactly? I'm pretty sure you're referring to the idea where a previously failed leader comes back up and becomes the leader again, and that holds older values. But what does this have to do with locking?
I'm referring to a situation where grabbing a distributed lock allows you to write to some external data store. If someone who previously grabbed it goes down, comes up, and thinks they still have it, that could lead to multiple concurrent writes to the external store.
Great explanation for a Lamport counter. You have a gift for teaching. I have two questions. 1. At @11:07, isn't Leader DB used just for writes ? 2. Is there a use case where software engineer will use this in real life? Mostly everybody is using some kind of cloud infrastructure where user does not have to worry about implementation details (but one has to learn how to operate the cloud). I am interested what does "distributed" engineer do? It looks to me that this is a job that is mostly needed in developing cloud infrastructure behind the scenes.
Nope, you can read from leaders too. This is more theoretical, and I imagine that the only software engineers using this IRL are the ones developing databases that use it IRL
Hi Jodan, great video! For the version vector concurrent situation, why don't we merge the updates as [3, 2], instead of arbitrary choose left/right node version?
@@yunfu518 I don't understand what you mean here - we need to agree between all the nodes which one of these writes came before another, which is the definition of linearizability
Yoo, Jordan. At the 8:12 I'm wondering what about the final counter count? In total those two dudes have incremented the clocks 5 times in total, but ultimatelly we ended up with 4, due to the initial first increment that happend concurrently and left both DB's at 1, instead of going for the max(A1, B0) + 1 = 2, or max(A0, B1) = 2 (...) Aha, you explained it right after I started having my question. Nice, I'm going to leave a comment for the algorithm anyway, thanks!
Hey man! Wishing you the best - unfortunately unlike my time at google, I'm now a bit short on time, so I don't think I have the availability to mentor at the moment :(
Loved watching this video you made it so easy, definitely waiting for distributed consensus, thanks!!
Glad to hear, thanks Leo!
Thanks for making all these insightful videos! One question -> in this new series of videos, you mentioned that linearizability is mainly about reads never go back in time. For example, x = 10 is written to the replica1, x = 5 (old value) in replica2. Before new value 10 is populated to replica2, user read 5 from replica2 and that's okay. The only issue is that if they read 5 from replica2, and then 10 from replica1 and again 5 from replica2 (go back). (i.e okay to read not the most recent value, but once we read newer values we can never read old ones again). I also checked out the old series video for this concept. There it mentioned that no stale reads, which my understanding is that user should never read 5 after 10 is written to any of the replicas which is linearizability. I am a bit confused now, curious which one is more accurate?
I'd say the first definition is more accurate. The second one would moreso define "strong consistency", but it's very easy to build strong consistency from linearizable storage. In linearizable storage, each write has a sequence number, so we can easily tell when our write is going back in time and disregard it.
Hi Jordan! Love your videos. I look forward to them. Just one question. At the end of your video around 12:10 you said, "if you're holding a lock... and then all of a sudden the system thinks the previous owner of the lock is the one holding it, you're in trouble." What does this mean exactly? I'm pretty sure you're referring to the idea where a previously failed leader comes back up and becomes the leader again, and that holds older values. But what does this have to do with locking?
I'm referring to a situation where grabbing a distributed lock allows you to write to some external data store. If someone who previously grabbed it goes down, comes up, and thinks they still have it, that could lead to multiple concurrent writes to the external store.
Great explanation for a Lamport counter. You have a gift for teaching. I have two questions.
1. At @11:07, isn't Leader DB used just for writes ?
2. Is there a use case where software engineer will use this in real life? Mostly everybody is using some kind of cloud infrastructure where user does not have to worry about implementation details (but one has to learn how to operate the cloud). I am interested what does "distributed" engineer do? It looks to me that this is a job that is mostly needed in developing cloud infrastructure behind the scenes.
Nope, you can read from leaders too.
This is more theoretical, and I imagine that the only software engineers using this IRL are the ones developing databases that use it IRL
Hi Jodan, great video! For the version vector concurrent situation, why don't we merge the updates as [3, 2], instead of arbitrary choose left/right node version?
How do you merge x = 6 and x = 12? The gist is, we want a total ordering, and in a total ordering you can't have concurrent writes.
@@jordanhasnolife5163 Yea, I was kind of confused why don't we take the latest version of each of the replica for concurrent writes.
@@yunfu518 I don't understand what you mean here - we need to agree between all the nodes which one of these writes came before another, which is the definition of linearizability
@@jordanhasnolife5163 Nvm, i get it now. I was stuck with the idea of merging. But you are right, without knowing the order, we can't really merge.
Yoo, Jordan. At the 8:12 I'm wondering what about the final counter count? In total those two dudes have incremented the clocks 5 times in total, but ultimatelly we ended up with 4, due to the initial first increment that happend concurrently and left both DB's at 1, instead of going for the max(A1, B0) + 1 = 2, or max(A0, B1) = 2 (...)
Aha, you explained it right after I started having my question. Nice, I'm going to leave a comment for the algorithm anyway, thanks!
Hahaha
thanks for the great explanation! Finally understand the lamport! Curious what's the pros and cons between version vector and lamport clock?
Lamport clock: fewer bytes to store
Version vectors: tell us when writes are concurrent as opposed to ordering them arbitrarily
I would love to be mentored by you.
I am an international student who's struggling to find a job
Hey man! Wishing you the best - unfortunately unlike my time at google, I'm now a bit short on time, so I don't think I have the availability to mentor at the moment :(
Plugin your ipad charger 😛