Learn the fundamentals of Database Engineering, head to husseinnasser.com/courses for a discount code on my udemy course
Eventually available. That was funny.
I am addicted to your backend videos.
Respect++ !
The best channel on engineering that I have come across.
Looking at the quality of content discussed here, I can already see this channel become huge in no time. Kudos to you Hussein! Great work!
I was asked to explain the CAP theorem in a job interview and struggled. I wish I had had this video back then! Thanks for the amazing videos, Hussein!! :D
Thanks Adarsh!!
Did you get the job?
@KerronHutton No
@AdarshMenon 😅
which company?
Even I had confused P with data sharding/partitioning. You tried hard to explain CA. Finally, you made me confident that CA is a possibility with P. You explained the A in ACID very well.
This is one of my favorite videos. I love the new technique you have picked up, putting objects on the screen while we can also see your face; it helps build rapport!
Super video! I applauded for £2.00 👏
I have watched many videos about CAP but I got it only from this video.
Your examples are so helpful for understanding it exactly. Thanks!
In your example of a master-replica data store, you described inconsistent reads caused by replication lag. Wikipedia says: "When there is no network failure, both availability and consistency can be satisfied." In this case, there does not seem to be a network failure, unless replication lag is considered one. If there is no network failure (no partition occurred), why couldn't we provide both A and C, according to the theorem and the wiki?
Being free of partitions doesn’t just mean no network failures. It means no networking at all. So the only way you get both A and C is when you have no network partitions. Network partitions lead not only to failures but to latency as well.
The only way no network failures can occur is when you absolutely don’t have any network.
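To make the replication-lag point in this thread concrete, here is a minimal sketch in Python. It assumes an asynchronous primary/replica setup; the `Primary` and `Replica` classes and the `replicate` method are hypothetical names made up for illustration, not any real database API. The primary acknowledges the write before shipping it, so a read from the replica in between returns stale data even though nothing has "failed".

```python
class Replica:
    """Hypothetical read replica holding its own copy of the data."""
    def __init__(self):
        self.data = {}


class Primary:
    """Hypothetical primary that replicates asynchronously."""
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []  # writes acknowledged but not yet shipped

    def write(self, key, value):
        # The write is acknowledged immediately; replication happens later.
        self.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Simulates the replication stream catching up after some lag.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


replica = Replica()
primary = Primary([replica])

primary.write("x", 1)
stale = replica.data.get("x")   # read during the lag window: None
primary.replicate()
fresh = replica.data.get("x")   # read after the replica catches up: 1
```

The window between `write` and `replicate` is the lag: the network works, yet the replica briefly disagrees with the primary.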
Eventually consistent = immediately inconsistent
Awesome explanation buddy!
Hussein sir your explanations are dope
You can add "eventual" to anything
About eventual consistency, I think the point is that the system is guaranteed to eventually be consistent *automatically*, as opposed to possibly being inconsistent forever. Interestingly, eventual availability may also make sense: eventually the system will be available automatically, as opposed to someone having to go and (for example) restart a failed service or something. Thoughts?
I say that tongue in cheek, and I don’t see anything wrong with “eventual consistency”. I think the problem here is that we put too much emphasis on trying to label and name things that don’t really need to be named and labeled. That just causes unnecessary confusion and distracts us from what is really important.
you are awesome Hussein.. great explanation 👍
love that sword in the background
I thought being available meant having replicas on different nodes, whether for a service or a database, so that when a machine dies we have more nodes to answer and hence achieve availability. Isn't this achieved by adding partitioning? So how can I be highly available without partitions? How can I be available and consistent? Can a single beefy machine be considered highly available?
Please also talk about PACELC Theorem
Some days ago I got asked this in an SDI at DE Shaw. I had already learned it by rote and aced it.
I think CAP theorem does not really capture the essence of tradeoffs involved with scaling architecture, PACELC theorem is a better representation imo.
en.wikipedia.org/wiki/PACELC_theorem
Was studying HLD for interviews. So recently came across CAP
Perfect explanation.. thank you...
Hey Hussein, when you explained the sync write to 3 replicas, you mentioned that it's an example of a consistent and partition-tolerant system. Isn't it consistent and available? In this case your system allows a write only if it reaches all replicas, which suggests that it's aiming to prevent partitioning. In case of a network error we then sacrifice availability, as the write cannot happen until it's possible to write to all nodes.
While the replicas are being made, the node is unavailable until the replication is done.
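A small Python sketch may help with the question above. It assumes the simplest possible model of a synchronous write to every replica; the `Replica` class, its `reachable` flag, and `sync_write` are hypothetical names for illustration only. When one replica is cut off, the write is refused, so the copies never diverge (consistency) but the system stops accepting writes (availability is what gets sacrificed).

```python
class Replica:
    """Hypothetical replica; `reachable` simulates a network partition."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reachable = True


def sync_write(replicas, key, value):
    """Accept the write only if every replica can be reached."""
    if not all(r.reachable for r in replicas):
        return False  # rejected: consistency is kept, availability is lost
    for r in replicas:
        r.data[key] = value
    return True


replicas = [Replica("a"), Replica("b"), Replica("c")]

ok_before = sync_write(replicas, "x", 1)  # all replicas reachable
replicas[2].reachable = False             # simulate a partition
ok_during = sync_write(replicas, "x", 2)  # refused during the partition
values = [r.data["x"] for r in replicas]  # every copy still holds 1
```

Under this model the system behaves as CP: during the partition it stays consistent by refusing the write rather than staying available.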
awesome explanation. Thanks
Thanks for this amazing video, Hussein! Could you please talk about index corruption in databases? We recently experienced index corruption in 2 different tables in our Postgres database. I didn't know this was possible until I saw it firsthand. We found out about the corruption accidentally when we tried to read data that we were absolutely sure was in the database, but the database kept returning no results.
Thanks for sharing. I have never run into index corruption; I thought these cases were rare. The only one I know of is Uber, which ran into it back in 2016 with Postgres 9.2; it was a bug in that version.
I will do more research. Do you know what Postgres version you were running?
@hnasr Probably too late already: version 9.6
Would like to see you discuss the SOLID principles!! Can we expect that?
Great video thank you so much
Hi Hussein, can you give your opinion about system design interviews? It's really a trend on all the online platforms and bootcamps. I know what system design is, but what I don't understand is this: if you only learn how to design a parking system and so on in theory, but never practice or implement it, how does that make you a better engineer or qualified for the position?
I think I gave my opinion in a video, not sure. I think it's not productive to watch someone else design a system. Designing a system is an art that is unique to you, with your own use cases, situations, and requirements. Following a step-by-step design shackles you and locks you into a mindset that “this is the only right way of doing things”, which limits creativity.
I say if you want to design a system, sit down and do it. Don’t worry about anything else. Be critical of yourself in the process. You will feel proud of what you built.
Found the video th-cam.com/video/aAJJORDSlho/w-d-xo.html
Fast forward to 5:20
It’s never really "pick 2". Partition tolerance is a must in all systems, so you’re really picking between PA and PC.
Hey Hussein, great content as always. I wanted to point out something here. In case we prefer C over A, I found that your example would achieve neither of them in the following case:
- If the write to the first node succeeds but the write to the second node fails, and it can't be written after many tries. Now you can choose not to commit the changes on the master server, as you have a transaction going on. But the data is already written on the first child node. Thus it's not consistent, as you have to delete that data from the first node. And for that period of time, your data is inconsistent.
I just realized something while writing this comment: this example is always inconsistent, even if we don't get a failure. The first node will get updated, and it will take some amount of time to update the second node. So for that fraction of time, your system is inconsistent.
Sorry for such a long comment. Hope it makes sense.
Absolutely. As long as there is a partition (networking), there is a possibility of inconsistency. When the user attempts to commit to the primary node, the primary node sends a commit request to all worker nodes one by one, waits for the responses, and only commits when all the workers reply. It is possible that one node will get the commit milliseconds before the others, and in that time a read request might be issued to that node, which yields inconsistent results. That is why synchronous replication is also eventually consistent in the grand scheme of things.
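The window described in this exchange can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each worker is a plain dict, the "commit" is a direct assignment, and `commit_one_by_one` is a hypothetical helper, not how any real database applies a commit. It records what a reader would see on every worker after each intermediate step.

```python
def commit_one_by_one(workers, key, value):
    """Apply a write to each worker in turn and record what a reader
    hitting every worker would see after each intermediate step."""
    snapshots = []
    for i, worker in enumerate(workers):
        worker[key] = value
        if i < len(workers) - 1:  # later workers still hold the old value
            snapshots.append([w.get(key) for w in workers])
    return snapshots


workers = [{"x": 0}, {"x": 0}, {"x": 0}]
snapshots = commit_one_by_one(workers, "x", 1)
final = [w["x"] for w in workers]

print(snapshots)  # [[1, 0, 0], [1, 1, 0]]
print(final)      # [1, 1, 1]
```

Each intermediate snapshot mixes old and new values, which is the brief inconsistency both comments point at; only after the last worker applies the write do all copies agree.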
great video!
No one can be told what CAP theory is, they have to see it for themselves
You are awesome wowsome and never flawsome
Eventual Partition Tolerance
Is that a sword in the background?
Dear sir, can you please make a video?
Is it possible to make our own CDN server?
For example, I have 2 servers, one in the USA and one in Germany. All my sites are hosted on the USA server, but I need to hide my USA IP and show my Germany IP in dnschecker. Is that possible or not? Please make a video.
It should work like Cloudflare.
I don't use any company's CDN; I need to make my own CDN if possible. Please 🙏 make videos, thank you.
Homeboy here doing Caitlyn Jenner real bad 🤣🤣
CAP theorem is about distributed systems though, so, I don't think your example is correct when talking about 1 beefy machine. The guy who came up with this theorem did so while researching distributed systems specifically.
Also, CAP theorem is obsolete at this point, look at PACELC theorem, it's an extension of CAP.
Don't be an eventual subscriber 😅
No CAP
fr fr
baddo desuka, lmaooooo watching too much anime? baddo desu yo!
Comment 1
🧢
no CAP