What are Distributed CACHES and how do they manage DATA CONSISTENCY?

  • Published Jan 25, 2025

Comments • 535

  • @VrajaJivan
    @VrajaJivan 5 years ago +488

    Gaurav nice video. One comment. Writeback cache refers to writing to cache first and then the update gets propagated to db asynchronously from cache. What you're describing as writeback is actually write-through, since in write through, order of writing (to db or cache first) doesn't matter.

    • @gkcs
      @gkcs  5 years ago +56

      Ah, thanks for the clarification!

    • @KumarAbhishek123
      @KumarAbhishek123 5 years ago +37

      Yes, would be great if you can add a comment saying correction about the 'Write back cache'. Thanks for the great video!

    • @gururajsridhar7314
      @gururajsridhar7314 5 years ago +8

      I agree.. a comment in the video correcting this would be a good update.

    • @mrityunjoynath7673
      @mrityunjoynath7673 5 years ago +2

      So Gaurav was also wrong in saying "write-back" is a good policy for distributed systems?

    • @jyotipandey9218
      @jyotipandey9218 5 years ago

      @Gaurav Yes that would be great. That part was confusing, had to read about that separately.

  • @waterislife9
    @waterislife9 4 years ago +299

    Write-through: data is written in cache & DB; I/O completion is confirmed only when data is written in both places
    Write-around: data is written in DB only; I/O completion is confirmed when data is written in DB
    Write-back: data is written in cache first; I/O completion is confirmed when data is written in cache; data is written to DB asynchronously (background job) and does not block the request from being processed
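The three policies above can be sketched in a few lines (an illustrative toy using plain dicts for the "cache" and "DB", not a real cache client; the function names are made up for illustration):

```python
# Toy sketch of the three write policies: plain dicts stand in for the
# cache and the database; "pending" is the write-back buffer.
cache, db = {}, {}
pending = []  # write-back buffer, flushed asynchronously in a real system

def write_through(key, value):
    # Confirmed only after BOTH cache and DB are updated (synchronous).
    cache[key] = value
    db[key] = value

def write_around(key, value):
    # DB only; any cached copy is invalidated rather than updated.
    db[key] = value
    cache.pop(key, None)

def write_back(key, value):
    # Cache only; the DB write is deferred (risk of loss on a crash).
    cache[key] = value
    pending.append((key, value))

def flush():
    # Background job: persist buffered writes to the DB in bulk.
    while pending:
        key, value = pending.pop(0)
        db[key] = value
```

Note how only write-through confirms the write after both stores are updated; write-back trades that durability for latency.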

  • @GK-rl5du
    @GK-rl5du 5 years ago +498

    Other variants
    1. There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
    2. There are only two hard problems in distributed systems: 2. Exactly-once delivery 1. Guaranteed order of messages 2. Exactly-once delivery

    • @gkcs
      @gkcs  5 years ago +13

      Hahahaha!

    • @GK-rl5du
      @GK-rl5du 5 years ago +41

      @@gkcs A humble suggestion, I think you should have a sub-reddit for the channel, because these are such critical topics [not just for cracking interviews], I'm sure they'd definitely encourage healthy discussions. I think YT's comment system is not really ideal to have/track conversations with fellow channel members.

    • @RAJATTHEPAGAL
      @RAJATTHEPAGAL 4 years ago +1

      This is an underrated comment .... 😂😂😂

    • @kumarakantirava429
      @kumarakantirava429 4 years ago +1

      @@gkcs Can you please give some hints on WHY "out of order delivery" is a problem in distributed systems if the application is running on TCP? Please kindly reply.

    • @kumarakantirava429
      @kumarakantirava429 4 years ago

      @goutham Kolluru, can you please give a hint on WHY "out of order delivery" is a problem in distributed systems if the application is running on TCP? Please kindly reply.

  • @mannion1985
    @mannion1985 5 years ago +16

    I can already hear the interviewer asking, "with the hybrid solution: what happens when the cache node dies before it flushes to the concrete storage?" You said you'd avoid using that strategy for sensitive writes, but you'd still stand to lose up to the size of the buffer you defined on the cache in the event of failure. You'd have to factor that risk into your trade-off. Great video, as always. Thank you!

  • @mengyonglee7057
    @mengyonglee7057 1 year ago +42

    Notes:
    In Memory Caching
    - Save memory cost - For commonly accessed data
    - Avoid Re-computation - For frequent computation like finding average age
    - Reduce DB Load - Hit cache before querying DB
    Drawbacks of Cache
    - Hardware (SSD) much more expensive than DB
    - As we store more data in the cache, search time increases (counterproductive)
    Design
    - Database (Infinite information) vs Cache (Relevant information)
    Cache Policy
    - Least Recently Used (LRU) - Top entries are recent entries; remove least recently used entries in cache
    Issue with caches
    - Extra calls - When we couldn’t find entry in cache, we query from database.
    - Thrashing - Input and output cache without ever using results
    - Consistency - When update DB, we must maintain consistency between cache and DB
    Where to place the cache
    - Close to server (in memory)
    - Benefit - Fast
    - Issue - Maintaining consistency between memory of different servers, especially for sensitive data such as password
    - Close to DB (global cache, i.e. Redis)
    - Benefit - Accurate, Able to scale independently
    Write-through vs Write-back
    - Write-through - Update cache, before updating DB
    - Not possible for multiple servers
    - Write-back - Update DB, before updating cache
    - Issue: Performance - When we update the DB, and we keep updating the cache based on that, much of the data in the cache will be fine and invalidating them will be expensive
    - Hybrid
    - Any update first write to cache
    - After a while, persist entries in bulk to database
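The LRU policy mentioned in the notes above can be sketched with an OrderedDict (a minimal, illustrative implementation, not production code):

```python
from collections import OrderedDict

# Minimal LRU cache: recently used entries stay, the least recently
# used entry is evicted once capacity is exceeded.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None  # miss: the caller falls back to the DB
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

For example, with capacity 2, inserting a third key evicts whichever of the first two was touched least recently.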

    • @pushp3593
      @pushp3593 1 year ago

      nice, but the write-through and write-back part of the notes is wrong, pls correct it. you can check the other comments. thanks

    • @cheerladinnemouli2864
      @cheerladinnemouli2864 11 months ago

      Nice notes

  • @Sound_.-Safari
    @Sound_.-Safari 4 years ago

    Cache doesn’t stop network calls but does stop slow costly database queries. This is still explained well and I’m being a little pedantic. Good video, great excitement and energy.

  • @zehrasubas9768
    @zehrasubas9768 5 years ago +9

    Hi Gaurav, I really like your videos, thank you for sharing! I need to point out something about this video. Writing directly to the DB and updating the cache after is called write-around, not write-back. The last option you provided, writing to the cache and updating the DB after a while if necessary, is called write-back.

    • @gkcs
      @gkcs  5 years ago +1

      Thanks Zehra 😁

  • @NohandleReqd
    @NohandleReqd 2 years ago +1

    Teaching and learning are processes. Gaurav makes it fun to learn about stuff, be it systems or the egg dropping problem.
    I might just take the InterviewReady course to participate in the interactive sessions.
    Take a bow!

  • @enfieldli9296
    @enfieldli9296 3 years ago

    I just can't find better content on YT than this, thanks man!

  • @bhavyeshvyas2990
    @bhavyeshvyas2990 5 years ago +3

    Dude, you are the reason for my system design interest. Thanks, and never stop making system design videos

  • @mayankvora8329
    @mayankvora8329 3 years ago

    I don't know how people can dislike your videos, Gaurav; you are a master at explaining these concepts.

  • @Freeman937
    @Freeman937 4 years ago +2

    The world needs more people like you. Thank you!

  • @akash.vekariya
    @akash.vekariya 4 years ago +17

    This man is literally insane at explanation 🔥

  • @大盗江南
    @大盗江南 4 years ago +19

    I watched each of your videos at least twice lol, thank you!! WE ALL LOVE U! U R THE BEST!

    • @rishiraj9131
      @rishiraj9131 3 years ago +2

      I also watch his videos many times.
      At least 4 times, to be precise.

  • @OwenValentine
    @OwenValentine 5 years ago +5

    Gaurav, what you initially described as write-back at around 10:30 I have seen described as write-around. Write-back is where you write to the cache and get confirmation that the update was made, then the system copies from the cache to the database (or whatever authoritative data store you have) later... be it milliseconds or minutes later. Write through is reliable for things that have to be ACID but it is slower than write back. You later describe what I have always heard as write-back at around 12 and a half minutes

    • @gkcs
      @gkcs  5 years ago

      Yes, I messed up with the names. Thanks for pointing it out 😁

    • @durden0
      @durden0 6 months ago

      @@gkcs so does this mean that write-through is good for critical data (financial/passwords) and write-back/write-around is not?

  • @rahuljain5642
    @rahuljain5642 3 years ago +6

    If someone explains any concept with confidence & clarity like you in an interview, he/she can seriously rock it. Heavily inspired by you & love your system design content. Thanks for the effort @Gaurav Sen

  • @VikramKumar-qo3rg
    @VikramKumar-qo3rg 4 years ago

    Fun part. I was going through 'Grokking The System Design Interview' course, found the term 'Redis', started searching for more on it on youtube, landed here, finished the video and Gaurav is now asking me to go back to the course. Was going to anyway! :)

    • @gkcs
      @gkcs  4 years ago

      Hahaha!

  • @anjurawat9274
    @anjurawat9274 5 years ago +1

    I watched this video 3 times because of confusion, but your pinned comment saved my mind.
    Thank you, sir

  • @AnonyoX
    @AnonyoX 5 years ago +12

    Great video. But I wanted to point out that I think what you are referring to as 'write-back' is termed 'write-around', as it comes "around" to the cache after writing to the database. Both 'write-around' and 'write-through' are "eager writes" done synchronously. In contrast, 'write-back' is a "lazy write" policy done asynchronously - data is written to the cache and updated to the database in a non-blocking manner. We may choose to be even lazier, play around with the timing, and batch the writes to save network round-trips. This reduces latency at the cost of temporary inconsistency (or permanent, if the cache server crashes - to avoid which we replicate the caches).
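The batched "lazy write" idea described above can be sketched as follows (a toy illustration with plain dicts; FLUSH_EVERY and db_bulk_write are made-up names, not a real API):

```python
# Lazy (write-back) writes with batching: the write completes against
# the cache immediately; the DB sees one bulk write per batch.
FLUSH_EVERY = 3  # flush once this many writes have accumulated

cache, db, buffer = {}, {}, []

def db_bulk_write(items):
    # Stand-in for one batched network round-trip to the database.
    db.update(items)

def lazy_write(key, value):
    cache[key] = value          # confirmed to the client immediately
    buffer.append((key, value))
    if len(buffer) >= FLUSH_EVERY:
        db_bulk_write(dict(buffer))
        buffer.clear()
```

Until the batch flushes, the buffered writes exist only in the cache, which is exactly the crash-loss window the thread discusses.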

  • @SatyadeepRoat
    @SatyadeepRoat 4 years ago

    I am actually using write-back Redis in our system, but this video really helped me understand what's happening overall. Great video

  • @rajeevkulkarni2888
    @rajeevkulkarni2888 3 years ago +1

    Thank you so much for these videos! Using them I was able to pass my system design interview.

  • @manasbudam7192
    @manasbudam7192 4 years ago +1

    What you explained as a write-back cache is actually a write-around cache. In a write-back cache, you update only the cache during the write call and update the DB later (either on eviction or periodically in the background).

  • @Satu0King
    @Satu0King 5 years ago +51

    Description for write back cache is incorrect.
    Write-back cache: Under this scheme, data is written to cache alone and completion is immediately confirmed to the client. The write to the permanent storage is done after specified intervals or under certain conditions. This results in low latency and high throughput for write-intensive applications, however, this speed comes with the risk of data loss in case of a crash or other adverse event because the only copy of the written data is in the cache.

    • @gkcs
      @gkcs  5 years ago +8

      Thanks for pointing this out Satvik 😁👍

    • @justinmancherje6168
      @justinmancherje6168 5 years ago +4

      I believe the description in the video given for write-back cache is actually a write-around cache (according to grokking system design)

    • @mostinho7
      @mostinho7 4 years ago

      What if the cache itself is replicated? Will write-back still have a risk of data loss?

    • @arpansen964
      @arpansen964 2 years ago

      Yes, as per my understanding: write-through cache - when data is written to the cache, it is also modified in the main memory; write-back cache - when dirty data (changed data) is evicted from the cache, it is written to the main memory, so a write-back cache will be faster. The whole explanation of these two concepts given in this video seems fuzzy.

  • @neeraj91mathur
    @neeraj91mathur 3 years ago +1

    Nice video Gaurav, really like your way of explaining. Also, the fast-forward when you write on the board is great editing; it keeps the viewer hooked.

  • @ashwinasokan
    @ashwinasokan 2 years ago

    Bhai, you are a life saver! Brilliant tutoring. Thank you!

  • @jayantsogani8389
    @jayantsogani8389 5 years ago +9

    Thanks Gaurav, your lecture helped me crack MS. Keep posting videos

    • @gkcs
      @gkcs  5 years ago +2

      Congrats!

    • @shubham.1172
      @shubham.1172 5 years ago

      Are you in the Hyd campus?

  • @muraliboddu4007
    @muraliboddu4007 3 years ago

    nice quick video to get an overview. thanks Gaurav. you are helping a lot of people.

  • @majortakleef8445
    @majortakleef8445 4 years ago

    Gaurav, what you are describing as a write-back cache is actually called a write-around cache. What you describe as the hybrid mechanism is actually called the write-back cache. In both, the assumption is an asynchronous update, unlike write-through where the update is synchronous. Might be worth taking this video offline and uploading a corrected version to avoid misleading folks prepping for interviews.

  • @prakharpanwaria
    @prakharpanwaria 3 years ago +1

    Good video around basic caching concepts. I was hoping to learn more about Redis (given your video title)!

  • @shreyasns1
    @shreyasns1 3 years ago +1

    Thank you for the video. You could have gone a little deeper into how the cache is implemented. What's the underlying data structure of the cache?

  • @nimitkanani1691
    @nimitkanani1691 2 years ago

    At 2:45, you say that storing a lot of data in the cache leads to increased search times. Can you explain how?

  • @billyean
    @billyean 2 years ago

    Explained like my interviewed candidate today.

  • @AmitKumar-je7rn
    @AmitKumar-je7rn 3 years ago +1

    I have one doubt. The definition you gave for write-back should be for write-around. In write-around, we hit the DB first and then update the cache.
    In write-back, we first update the cache and then wait for some time to bulk-write to the DB.
    Please let me know if my understanding is wrong.

  • @carlamckenzie2669
    @carlamckenzie2669 1 year ago

    @12:48 Would this scenario apply if there are multiple replicas for a service with Redis?

  • @ravinmulchandani
    @ravinmulchandani 2 years ago +1

    Nice explanation, Gaurav. This video covers the basics of caching. In one of my interviews, I was asked to design a caching system for a stream of objects with a validity period. Could you make a video on this system design topic?

  • @kushal1
    @kushal1 4 years ago +1

    At 3:05 you mention that if we keep storing everything in the cache we might increase our search time. Isn't the cache key-value entries, with search being an O(1) operation?

    • @gkcs
      @gkcs  4 years ago

      It is O(1), but we have limited main memory. Once we run out, we will have to fall back on secondary storage, which is an I/O call.
      Also, the O(1) assumes very few collisions per hash bucket. As the number of entries per bucket increases, the search time slows too (this scenario is unlikely, but good to know about).

    • @kushal1
      @kushal1 4 years ago

      @@gkcs I agree with your points, but that point doesn't come through clearly in the video. It conveys as if the cache itself slows down when it is filled with more data within the given memory limit. Hope I am making sense.

  • @shoaibzafar5663
    @shoaibzafar5663 1 year ago

    This is everything I needed. I am really looking forward to learning how I can create an online game hosting server. I researched a lot on how to do it and didn't get what exactly was happening. Your CDN video was really good 👍. Now I understand how exactly a CDN works and why it uses distributed caching 👍💯

    • @gkcs
      @gkcs  1 year ago

      Thank you 😁

  • @muhammadanas11
    @muhammadanas11 4 years ago +1

    The way you explain concepts is AWESOME.
    Can you please create a video that describes Docker and containers in your style?

  • @chikumanu
    @chikumanu 3 years ago

    i think you mixed write-back with write-around cache. write-back is when you just update the cache and the database gets updated at a later point in time. write-around is when the db gets updated first and then the cache gets notified asynchronously about that update.

  • @manishamulchandani1500
    @manishamulchandani1500 3 years ago +1

    I have one doubt regarding the cache policy. Gaurav explained that for critical data we use Write Back policy to ensure consistency. In write through one instance memory cache gets updated and others can remain stale.
    1) My question is same can happen in Write Back, one instance's in memory cache entry gets deleted and we update DB..other instances still have that entry. So there is inconsistency in write Back as well. Why do we prefer write back for critical data because same issue is there in write back.
    If answer is invalidate all instances in memory cache entry then same can be done for Write through. Which makes me ask question 2.
    2) My another question is : We can update all instances' in memory cache entry and then update DB. In this way consistency is maintained so why not we use this for critical data like password financial information.

  • @kabooby0
    @kabooby0 3 years ago +4

    Great content. Would love to hear more about how to solve cached data inconsistencies in distributed systems.

  • @devinsills1281
    @devinsills1281 3 years ago +3

    A few other reasons not to store completely everything in cache (and thereby ditching DBs altogether) are (1) durability since some caches are in-memory only; (2) range lookups, which would require searching the whole cache vs a DB which could at least leverage an index to help with a range query. Once a DB responds to a range query, of course that response could be cached.

  • @michaelscheppert3664
    @michaelscheppert3664 3 years ago

    thanks for this quick tutorial :) your English is really good

  • @veryconfuseduser
    @veryconfuseduser 2 years ago

    13:00 Can you please explain why financial data should use write-back and not write-through? I thought you want high consistency; it's not like a social network where consistency doesn't matter. Write-through has higher consistency than write-back, does it not?

  • @sjljc2019
    @sjljc2019 4 months ago

    I have a question: the first point you mentioned is reducing network calls, but since we need a separate system, the network-call minimization stands void, right?
    So how beneficial is it to use Redis if we are still doing I/O calls? Is it that a DB I/O call is more expensive than a Redis I/O call? I am a bit skeptical about this part.

  • @rishiraj1616
    @rishiraj1616 5 years ago

    This is my first video on your channel and I must say that you explain very well! You seem professional and knowledgeable, and you researched your topic well!

  • @KajkoCar
    @KajkoCar 5 years ago +71

    Title: What is Distributed Caching? Explained...
    There is not a single 'D' in this 'Distributed' explanation. You are talking about 'cache' and its variations in implementation ONLY.
    All in all, change the title to 'What is caching?'

  • @sharifulhaque6809
    @sharifulhaque6809 3 years ago

    Very easy to understand, Gaurav. Thanks a lot!!!

  • @pat2715
    @pat2715 11 months ago

    amazing clarity, intuitive explanations

  • @ShukyPersky
    @ShukyPersky 3 years ago

    What is the efficiency of such an architecture for rapidly changing data? Not only is write-through required (as Vijay Somasundaram indicated below), but reading from the database is always required in order to get the most updated information, in which case this architecture is almost useless. Am I missing anything?
    In other words, it would be better to start by going through the use cases where this architecture has an advantage.
    Thanks a lot for preparing this video

  • @jajasaria
    @jajasaria 5 years ago +2

    always watching your videos. topic straight to the point. keep uploading man. thanks always.

  • @runfunmc64
    @runfunmc64 5 years ago +5

    The cache isn't stored on an SSD, it's stored in memory, right? At 2:36 you mentioned a cache is stored on an SSD.

    • @nou4605
      @nou4605 3 years ago

      Depends on the kind of cache

  • @stevengassert7747
    @stevengassert7747 3 years ago

    As we add more data to a cache, why would search time increase? Since we most likely are using key-value pairs, wouldn't retrieval always be O(1)?

  • @djanupamdas
    @djanupamdas 5 years ago

    I think simply saying THANK YOU will be far too little for this help!!! Superb video.

    • @gkcs
      @gkcs  5 years ago

      Glad to help :)

    • @jagrick
      @jagrick 5 years ago

      I mean you can always do more by becoming a channel member 😄

  • @hareendranep8422
    @hareendranep8422 4 years ago

    Very nice presentation. Simple, powerful, and fast. Keep up the style

    • @gkcs
      @gkcs  4 years ago

      Thank you!

  • @246810ben10
    @246810ben10 5 years ago +1

    The hybrid approach suggests:
    1. Write updated data only to the local server cache; do not write to the DB.
    2. After some time interval, persist a chunk of the cache data to the DB.
    But what if the local server crashes between 1 and 2? Isn't the updated data lost forever?

    • @gkcs
      @gkcs  5 years ago

      It is.

  • @coledenesik
    @coledenesik 2 years ago

    Please make a full series on Redis, or a paid course.

  • @GalazyC12
    @GalazyC12 11 months ago

    Thank you so much! Your videos are really valuable. Really appreciate your effort, sir!

  • @mahadreamz
    @mahadreamz 3 years ago

    Does the global cache also run as an in-memory data store, and can it be deployed in a different cluster (other than the app server)?

  • @openretailsstore3808
    @openretailsstore3808 3 years ago

    @Gaurav Sen - How can network calls be reduced with a distributed cache, where the cache itself is distributed? Why is a distributed cache faster than a database?

  • @roycrxtw
    @roycrxtw 3 years ago

    I really wish I had watched this video before my interview this week... :(

  • @airliu1
    @airliu1 3 years ago

    What types of data are better stored in a server cache vs a global cache?

  • @victorvianna10
    @victorvianna10 5 years ago +1

    Your System Design videos are very good and helpful, thanks!

  • @rahulchawla6696
    @rahulchawla6696 2 years ago

    wonderfully explained. thanks

  • @jazeem10
    @jazeem10 5 years ago +205

    this isn't distributed caching, this is simply about caching & Redis...

    • @larskrenning260
      @larskrenning260 4 years ago +20

      @@deshkarabhishek This is indeed another example of "click bait": a person saying X but - as many others before him - explaining Y, where Y is The Basics and X is The Difficult. The people who use this "click bait" trick are mostly people from India. I'm not saying that all Indian people upload worthless info; some of them are really spectacular - but 100% of the worthless info is from India. With regards to Redis / caching, my guess is that RedisLabs acknowledged this "click bait" problem and uploads extremely good info. (And some of this info is actually done by some ultra-intelligent Indians - because when an Indian is intelligent, he / she is extremely intelligent)

    • @YashArya01
      @YashArya01 4 years ago +15

      @@larskrenning260 I think you gotta keep in mind that some of what you're seeing is because of the high population and because of the higher proportion of Indians pursuing engineering. :) So I'm not sure you get anything of value from that anecdotal observation.

    • @namangarg3933
      @namangarg3933 4 years ago +10

      @@deshkarabhishek Well, that's bad. It would be great if you could share a video with your production experience. Maybe Gaurav can also learn about 'DISTRIBUTED' cache from you.

    • @shubhammadankar6390
      @shubhammadankar6390 4 years ago +2

      @@namangarg3933 correct

    • @TheAppAlchemist
      @TheAppAlchemist 4 years ago +3

      @@larskrenning260 lol, are you a jealous pig? cuz your comment sounds like a nazi who is not potty trained. this is youtube, not a toilet; please behave and inform yourself before commenting such stupid stuff.
      your comment makes me want to go and throw up
      100% crap people like you make this world stink
      I agree this video was not his best video, but you all are here and learning from him
      your comment shows how ignorant you are; I would delete it if I were you

  • @devendrparhate
    @devendrparhate 4 years ago +1

    Correction: INPUTING and OUTPUTTING -> Adding and Removing 5:46

  • @sandeepk9640
    @sandeepk9640 3 years ago

    A lot of information nicely packed into a glimpse. Great work

  • @debsworld3784
    @debsworld3784 1 year ago

    Gaurav - one question here: I got the part about write-through and write-back; what about read-through and read-back?

  • @an_R_key
    @an_R_key 4 years ago

    You articulate these concepts very well. Thanks for the upload.

  • @daysimples7658
    @daysimples7658 4 years ago +2

    Summary
    Caching can be used for the following purposes:
    - Reduce duplication of the same request
    - Reduce load on the DB
    - Fast retrieval of already computed things
    Cache runs on fast hardware (RAM/SSD) rather than on commodity hardware.
    Don't overload the cache, for obvious reasons:
    - It is expensive (hardware)
    - Search time will increase
    Think of two things (you obviously want to keep the data that is going to be used most - so predict!):
    - When will you load data into the cache
    - When will you evict data from the cache
    Cache Policy = Cache Performance
    - Least Recently Used
    - Least Frequently Used
    - Sliding Window
    Avoid thrashing in the cache:
    - Putting data into the cache and removing it without ever using it.
    Issues can be of data consistency:
    - What if data has changed?
    Problems with keeping the cache in server memory (in-memory):
    - What if the server goes down? (the cache goes down with it)
    - How to maintain consistency of data across caches
    Mechanism
    Write-through
    - Always write first to the cache (if there is an entry) and then write to the DB.
    - The second part can be synchronous.
    - But if you have an in-memory cache on every server, you will obviously run into data inconsistency again.
    Write-back
    - Go to the DB, make an update, and check the cache; if you have the entry, evict it.
    - But suppose there is no important update and you keep evicting entries from the cache like this: you can again fall into thrashing.
    One can use a hybrid approach as per the use case.
    Thanks to @GauravSen
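The thrashing point in the summary above can be demonstrated concretely: with an LRU cache one entry smaller than a cyclically scanned working set, every entry is evicted just before it would be reused, so the hit rate is zero (a toy sketch):

```python
from collections import OrderedDict

# Thrashing demo: an LRU cache of capacity 3 scanned cyclically with a
# 4-item working set misses on every single access.
CAPACITY = 3
cache = OrderedDict()
hits = misses = 0

def access(key):
    global hits, misses
    if key in cache:
        hits += 1
        cache.move_to_end(key)     # refresh recency on a hit
    else:
        misses += 1
        cache[key] = True          # load on a miss
        if len(cache) > CAPACITY:
            cache.popitem(last=False)  # evict the LRU entry

for _ in range(5):                 # scan the working set repeatedly
    for key in ["a", "b", "c", "d"]:
        access(key)
```

After five full scans (20 accesses), every access was a miss: the cache did pure input/output without its contents ever being used.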

  • @pranavsurampudi6838
    @pranavsurampudi6838 4 years ago +2

    One observation: a cache need not run on expensive hardware; for a cache one would use "memory"-centric instances on the cloud, not SSD(s). And caches can be used in place of a database if the size is relatively small and you require high throughput and efficiency.

  • @AbhideepChakravarty
    @AbhideepChakravarty 4 years ago +1

    The drawback of write-through you explained is equally applicable to write-back, i.e., I null the value in S1 and the value is still not null in S2. The major thing is: Redis is not a distributed cache. Even their own definition does not include the word "Distributed" - Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.

  • @Mysterious_debris_1111
    @Mysterious_debris_1111 4 years ago

    Awesome explanation, Gaurav. You're cool, man. We want a lot more from you. We admire your ability to explain topics with great simplicity.

  • @sauravchaudharysc
    @sauravchaudharysc 3 years ago

    Can you please explain this?
    Here you say that in write-through we first write to the cache and then update main memory, but another website says it should be done in parallel.
    For write-back, the video says we update the main memory and evict the entry from the cache, but another website says that in the write-back method only the cache location is updated and main memory is updated later.

  • @1apocalyps
    @1apocalyps 4 years ago

    Given that initially one will have nothing cached, at what point does one begin the initial caching of data? Is it when the user count has increased to a pre-envisioned number?

    • @gkcs
      @gkcs  4 years ago

      Have a look at how to avoid cold starts. I have kept the example simple by expecting the cache to do 'lazy loading'.

  • @legozxx6655
    @legozxx6655 5 years ago +2

    Great explanation. You are making my revision so much easier. Thanks!!

  • @pradeepkumarmishra1711
    @pradeepkumarmishra1711 3 years ago

    As you said, if tons of data are stored in the cache, the search time will increase, but how? I think it uses key => value pairs (a dictionary), so won't it always be faster than DB calls? Also, the cache is stored on some other system (server) that your server calls to get the data, so how are we reducing network calls here? Could you please help?

  • @code_report
    @code_report 5 years ago +2

    Great video Gaurav!

    • @gkcs
      @gkcs  5 years ago

      Thanks code_report 😁

  • @ivandrofly
    @ivandrofly 5 years ago +1

    My boy looks very energized... keep it up!

    • @gkcs
      @gkcs  5 years ago +1

      😁

  • @happilysmpl
    @happilysmpl 3 years ago

    Excellent! Great video with tremendous info and design considerations

  • @munibabuc
    @munibabuc 3 years ago

    In a write-through cache, if the DB update fails for some reason, how do we roll back the cache update?

  • @sivaram2492
    @sivaram2492 3 years ago

    A label/comment in the video about the swapped usage of write-back and write-through would help future viewers. I never saw the pinned comment until recently. This could have backfired in an interview.

  • @veliea5160
    @veliea5160 2 years ago

    Is the Redis you mentioned here a Redis cache server or a Redis database?

  • @bhardwajrrc6963
    @bhardwajrrc6963 4 years ago

    In your hybrid update system there are issues: if the updated server fails before writing to the DB, the entry will be lost while the customer is unaware of the loss of data. Instead, it could be done like in the write-back system: the data should be written first to the DB, and then that new data is updated on all servers that have it in their cache.

  • @vakul121
    @vakul121 4 years ago +1

    It is a really great video. Finally found a detailed one. Thank you for sharing your knowledge!!

  • @chenwang7194
    @chenwang7194 4 years ago +3

    Nice video, thanks! In the hybrid mode, when S1 persists to the DB in bulk, S2 still has the old data, right? How do we update S2?

  • @rupeshpatil6957
    @rupeshpatil6957 4 years ago

    Thanks for the video, Gaurav.
    What if the global cache itself fails? What are the different backup strategies for it?

  • @adithyaks8584
    @adithyaks8584 4 years ago

    We are using Redis... but due to network calls we implemented an in-memory store to keep some information temporarily... Looking for some cache/DB that lives in memory, in the server instance's memory space, for a Spring Boot application

  • @rahulgulabani498
    @rahulgulabani498 2 years ago

    How is it reducing network calls if the Redis cluster is running on a different server?

  • @SuperGojeto
    @SuperGojeto 4 months ago

    I have another doubt about your "write-through" in the video. Even if it's non-financial data like, say, comments, and you write to the database during flushing: say you are flushing from cache server 1 while server 2 also receives updates for the same key; then stale data will be written to the database.

  • @flixpods
    @flixpods 4 years ago

    Very knowledgeable. Nicely explained

    • @gkcs
      @gkcs  4 years ago

      Thanks!

  • @miguelpetrarca5540
    @miguelpetrarca5540 3 years ago

    If a disk store is used instead of in-memory, does it still suffer from consistency issues? I would assume a disk store is still on a per-server basis and therefore has the same consistency issue as in-memory?

  • @bobuputheeckal2693
    @bobuputheeckal2693 2 years ago +1

    I couldn't get info on distributed cache.

  • @GustavoRodrigues-le3zw
    @GustavoRodrigues-le3zw 2 years ago

    Amazing Explanation!! Thanks!!

  • @liveentertiner7275
    @liveentertiner7275 4 years ago

    How do we handle the cache if a new user is created and the list-of-users API doesn't return the newly created user (because the cache has an old set of records)?
    Please help me

  • @timhomstad
    @timhomstad 3 years ago +1

    Do you implement caching on most systems? It adds complexity; how can you determine whether it is worth the additional development effort?
    Love the videos, by the way. These are a great learning tool; you do a great job.

  • @kevinz1991
    @kevinz1991 2 years ago

    learned a ton in this video, thanks so much

  • @esrevereverse-l8v
    @esrevereverse-l8v 2 years ago

    Around 3:30, this is listed among the PROs of having a cache system. Can someone explain how network calls are being reduced? In real scenarios, how can we reduce network calls when we are anyway pinging the DB for information?

  • @RiteshKumar-qk6uy
    @RiteshKumar-qk6uy 5 years ago +1

    Hi @Gaurav, in the write-through policy, could this also be an issue: suppose someone does an update transaction; it updates the cache and goes through to the DB, but some check enabled in the DB makes that transaction fail, so we end up with incorrect data in the cache? The same goes for the hybrid model: out of 10 transactions, say 2 fail while updating the DB; this way we will get incorrect responses from the cache.

  • @adamhughes9938
    @adamhughes9938 4 years ago

    If we used Redis, wouldn't a write-through cache just update all the nodes in the cluster eventually? Or are you saying write-through can't work on caches that have multiple servers? This confused me as to why we couldn't use write-through in your first example...