Distributed Systems 7.1: Two-phase commit

  • Published on 12 Sep 2024

Comments • 56

  • @Alvaro-hm9vu 2 years ago +20

    Got a job because of you... you changed my life... thank you

  •  3 years ago +35

    As soon as I find enough time I'm going to go through all the series. Thank you for making the effort.

  • @krizh289 4 months ago +1

    Thanks for putting these lectures on YouTube. Education should be accessible to all.

  • @ahmetb 3 years ago +6

    I was reading your book and got tired at the beginning of chapter 8. Then I found your YouTube channel while trying to watch some videos before digging into the chapter! Thanks for all your work in making this field more understandable.

  • @IrvinHerreraGarza 2 years ago +3

    Mr. Kleppmann, I love your book and the way you explain things in your videos. Thank you so much for creating this material.

  • @andreip9378 months ago +1

    Wow, I didn't know Martin has a YT channel. Instant subscribe.

  • @sachin_getsgoin 3 years ago +2

    Delighted to watch the series. Thanks for creating this. I am already grateful to you because of "DDIA"

  • @nakonachev1407 2 years ago +3

    Great lecture, straight to the point. Thanks for the effort put into it and the adequate way of explaining it.

  • @zhou7yuan 3 years ago +6

    "Consistency" [0:11]
    ACID
    Read-after-write-consistency (lecture 5)
    Replication
    Consistency model
    Distributed transactions [2:26]
    Atomic commit versus consensus [4:47]
      consensus | atomic commit
      1 or more nodes propose | all nodes vote
      any 1 proposed value is decided | all must commit, or all must abort
      crashes tolerated | abort if 1 node crashes
    Two-phase commit (2PC) [6:33]
    (key moment) [9:45]
    The coordinator in two-phase commit [10:25]
    Fault-tolerant two-phase commit (1/2) [12:58]
    Fault-tolerant two-phase commit (2/2) [16:43]

  • @thewolfer2281 2 years ago +2

    Legend!! I'm passing this course cuz of this playlist: the whole of distributed systems in 1 day, thanks to you

    • @iyadelwy1500 2 years ago

      Come on then, Abdo

    • @thewolfer2281 2 years ago

      @iyadelwy1500 😂😂😂 I swear I left it there so you would see it

  • @mikedelta658 a year ago

    Crystal clear explanation. Hats off to you, Martin!

  • @timurlanrahimberdiev6096 2 years ago +2

    Great lectures, Great book, Great author 👍

  • @lifeirao7605 3 years ago +5

    super illustrative. Thank you!

  • @user-vu5dl8mj2m 2 years ago

    Grateful for the amazing lecture! I finally have some impression of how Raft works.

  • @2tce 2 years ago +1

    @martin Kleppmann, thanks for the interesting presentation all the way from Cambridge. I'd like to suggest that we could update the linearizable CAS to:
    IF old = new THEN
    success := true
    There is no point comparing the old and new if they are the same. :)

  • @paulchicos1872 2 years ago

    you my guy are a gem of humanity

  • @manishsakariya4595 3 years ago

    Very nice and detailed video. I would love to see your explanation of three-phase commit.

  • @veerajbhokre1847 8 months ago

    Amazing lectures. Thank you so much. You are a god.

  • @kobew1351 5 months ago

    Hope you can make a video to explain three phase commit and how it improves fault tolerance.

  • @martinkunev9911 3 years ago +3

    What does the failed replica do when it comes up?

  • @zaixrx months ago

    Big thanks

  • @jeniamtl6950 11 months ago

    Atomic commitment is completely different from "atomic" in ACID. For example, if students and classes are handled on different nodes, then after all participants have voted yes and the coordinator sends the commit messages, there will be a moment when the student has enrolled in a class but the class does not yet exist, or vice versa. This is completely different from "atomic" in ACID.

  • @OffAndGo a year ago

    Hello, the video is so helpful, but I hope my question can be clarified.
    Does the coordinator node care whether the other nodes have committed successfully? If it does and a node fails to commit, does the coordinator make a second decision and send an abort to all the nodes?

  • @rastaeule7482 3 years ago

    Very clear explanation!

  • @BHARATKUMAR-le6eq 2 years ago

    Hi Martin, you said the failure detector can run on any node. My question is: what happens if the node on which the failure detector is running goes down or crashes? How do we then detect how many other nodes have also crashed?

  • @abcdef-fo1tf a year ago

    Am I right in understanding that we can use Raft to send total order broadcasts and elect new coordinators for node communication, and two-phase commit for committing data?

  • @za406 a year ago

    Question: Why is the "prepare" message necessary if replicas "ack" on the original transaction message?

  • @ivan.p 3 years ago

    Good explanation! Thank you!

  • @albumlist1 2 years ago

    Hi Martin, thanks for this amazing series. I have a question here. If there are conflicting answers for a replica (one sent by the replica itself and the other sent by another node on its behalf, suspecting the replica is down) at around the same time, shouldn't the later decision be taken instead of the first? If some other node said "no" on this replica's behalf and then the actual replica recovers and says "yes", then taking the later decision looks more logical. The same is true in the opposite case.

    • @m-ld3832 a year ago

      At first glance, that approach is appealing, since it appears to be the safest, avoiding any confusion by taking the most conservative default position. However, that isn't actually necessary, by virtue of the way Total Order Broadcast works. This is down to the relative timing of the slow / recovered replica's vote of "Yes", and the consensus decision by all the nodes. If the "Yes" vote from the slow node is delivered _before_ the "No" votes cast by others on its behalf, those "No" votes are ignored in favour of the "Yes", since that was the first vote the nodes saw for that replica.
      What's not entirely clear from the video is precisely when a consensus is considered to have been reached, and if/how this is consequently communicated among them. Presumably, if all the other nodes have already settled on the decision against proceeding before the "Yes" vote is received from the slow one, then that decision is not invalidated. The previous video in this series may expand upon this.

  • @yuchen6630 a year ago

    thank you

  • @zuggrr 2 years ago

    This is fantastic ! thank you so much :)

  • @yihanwu3823 3 years ago

    Does fault-tolerant 2PC mean the coordinator is redundant and can be removed?

  • @danish6192 a year ago

    Why is the client opening a transaction simultaneously on 2 nodes in 2PC? Shouldn't the transaction be opened on the master node only?

  • @tanmaymehrotra86 2 years ago

    What if the nodes reply to the coordinator that yes, they can perform this transaction, and send the OK message (in response to prepare), but crash after sending it? I assume these nodes will replicate the data (via consensus), so even in the face of failure another leader will get elected. I do understand how total order broadcast works via Raft, but I am unable to see how the data is locked.

  • @complicated2359 a year ago

    If a database goes down after it has agreed to commit, what would you do?

  • @jainamm5307 5 months ago

    What happens if one of the nodes has sent OK for prepare but crashes while waiting for all the OKs? The transaction will go forward on all the other nodes.

    • @jainamm5307 5 months ago

      One potential solution to this problem is to have a recovery mechanism for the node when it comes back up.

  • @QDem19 3 years ago

    Thank you for going over this.
    I have a question regarding slide 2 of the fault-tolerant 2PC. Which node decides the fate of the transaction: is it the current term leader of the total order broadcast, or can it be any node participating in the transaction?
    It seems like it should be the former, i.e. the current term leader, but I just wanted to be sure.

    • @giorgiobuttiglieri5876 10 months ago

      Each node can independently understand if the distributed transaction failed: each node receives the same sequence of messages and the algorithm used to determine if the transaction failed is deterministic.
      So all the nodes will reach the same conclusion without the need of a coordinator.

    • @jainamm5307 5 months ago

      @giorgiobuttiglieri5876 When you say each node receives the same sequence of messages, how is the "sequence" guaranteed to be the same on every node?

    • @giorgiobuttiglieri5876 5 months ago

      @jainamm5307 For the proposed fault-tolerant version of 2PC, we use total order broadcast as the communication primitive.
      So by definition all nodes receive the same messages in the same order.
      If you are interested in how to achieve this, there are other videos on this channel explaining it very well.
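
The point made in this thread (every node sees the same ordered votes and applies the same deterministic rule) can be sketched roughly as follows. This is only an illustrative sketch, not the algorithm as presented in the lecture: the function name `decide`, the `(replica, vote)` message shape, and the "first vote per replica counts" rule are assumptions drawn from the discussion in these comments.

```python
from typing import Iterable, Optional


def decide(participants: set[str],
           delivered: Iterable[tuple[str, bool]]) -> Optional[str]:
    """Hypothetical decision rule run independently on every node.

    `delivered` is the sequence of (replica, vote) messages in the order
    total order broadcast delivered them. Only the first vote seen for each
    replica counts; later votes for the same replica are ignored.
    Returns "commit", "abort", or None while the outcome is still open.
    """
    first_vote: dict[str, bool] = {}
    for replica, vote in delivered:
        if replica not in participants or replica in first_vote:
            continue  # unknown replica, or a later (ignored) vote
        first_vote[replica] = vote
        if not vote:
            return "abort"  # a single counted "no" aborts the transaction
        if len(first_vote) == len(participants):
            return "commit"  # every participant's first vote was "yes"
    return None  # not enough votes delivered yet


# Two nodes replaying the same delivery order reach the same answer.
log = [("A", True), ("B", False), ("B", True)]
print(decide({"A", "B"}, log))
```

Because the input order is identical on every node and the function has no other inputs, no separate coordinator is needed to announce the outcome.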

  • @arthursimeon2620 2 years ago +1

    So is the coordinator used for decision making on commits, and the total order broadcast system just a backup in case the coordinator crashes?

  • @austecon6818 a year ago

    I still don't get how with geographically distributed nodes (with different ping/latency to each other)... total order broadcast can prevent a (very rare and unlikely) race condition where you have 5/10 nodes that get the failure detector message to abort fractions of a second before the sluggish node sends a vote to go ahead and commit... and the other 5/10 nodes would have the opposite ordering
    If it happens at exactly the same time... due to network latency effects... you could have a split of the network (5 nodes with low ping to the failure detector and 5 nodes with low ping to the sluggish node but high ping to the failure detector)... so in that case do you just go with majority rules and always have an odd total number of nodes to decide which is the true(er) version of history? But now we are into 3 phases not 2 phases...
    So is this like a shitty version of the raft protocol or something where it assumes 0 network latency?

    • @Ynno2 5 months ago

      Total order broadcast requires consensus and if only 5/10 nodes have agreed then there's no quorum and no consensus. Neither event will be actionable until n/2+1 nodes have received it. If there is a 50%/50% split, neither side of the split will make any decisions (nothing will be committed and everything will grind to a halt) until the partition is resolved.
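
The quorum arithmetic in this reply can be checked with a few lines. This is a minimal sketch assuming a simple majority quorum of n/2+1 as stated above; the helper names `quorum_size` and `can_make_progress` are made up for illustration.

```python
def quorum_size(n: int) -> int:
    """Smallest majority of n nodes (n/2 + 1 with integer division)."""
    return n // 2 + 1


def can_make_progress(reachable: int, n: int) -> bool:
    """True if a group of `reachable` nodes out of n can form a quorum."""
    return reachable >= quorum_size(n)


# With 10 nodes split 5/5, neither half reaches the quorum of 6.
print(quorum_size(10), can_make_progress(5, 10), can_make_progress(6, 10))
```

Since two disjoint groups cannot both contain a majority, at most one side of a partition can ever deliver messages, which is why the two halves cannot end up with opposite orderings.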

  • @murali1790able 2 years ago

    I thought consensus is used in databases, but it looks like consensus can't solve the atomic commit problem. Can anyone explain the real application of consensus?

    • @vhscampos1 2 years ago

      Consensus achieves total order broadcast, i.e. all nodes deliver messages/operations in the same order.

  • @tarunstv796 2 years ago

    Discourse from "distributed systems" God himself.
    Very nice 👌

  • @HasanAmmori 2 years ago

    "Reasonably simple way"... Yeah. That's what I thought

  • @BHARATKUMAR-le6eq 2 years ago

    I have one more question. Do we wait to get an "OK" message from all the replicas, or do we commit on a specific replica as soon as we receive its "OK" message? Waiting for all the replicas makes sense, but if we commit right after receiving a single "OK", it may lead to inconsistency. For example, if one replica sends "OK" and we commit the change on that replica, but the other replica crashes and does not send "OK", then the two replicas will be inconsistent.
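
In textbook two-phase commit the coordinator collects a vote from every participant before deciding, and commits only if all of them voted yes. A minimal sketch of that rule follows; the function name `two_phase_commit` and the callable-per-participant interface are hypothetical, and treating any exception as a "no" vote is an assumption standing in for crash or timeout handling.

```python
from typing import Callable, Mapping


def two_phase_commit(prepare: Mapping[str, Callable[[], bool]]) -> str:
    """Toy coordinator: returns "commit" only if all participants vote yes.

    `prepare` maps a participant name to a function that sends the prepare
    request and returns that participant's vote (hypothetical interface).
    """
    votes: dict[str, bool] = {}
    # Phase 1: ask every participant before deciding anything.
    for name, ask in prepare.items():
        try:
            votes[name] = ask()
        except Exception:
            votes[name] = False  # assumed: crash or timeout counts as "no"
    # Phase 2: one decision for everyone, never a per-replica commit.
    return "commit" if votes and all(votes.values()) else "abort"


print(two_phase_commit({"r1": lambda: True, "r2": lambda: True}))
```

The decision is a single value applied to all participants, so the inconsistency described in the question (one replica committed, another not) does not arise from the voting phase itself.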

  • @trozzonick77 3 years ago

    Would it not make more sense to use a queue as a helper for the coordinator?

  • @cristianokwiatkovsk9059 3 years ago

    First cut the fucking hair xD next recording...