Adding a cache is not as simple as it may seem...

  • Published on Sep 27, 2024

Comments • 211

  • @dreamsofcode
    @dreamsofcode  6 months ago +64

    Big shout out to everyone in the comments on this video for asking GREAT questions

    • @MohammadJairumi
      @MohammadJairumi 4 months ago

      Can you share your neovim distro?

    • @dreamsofcode
      @dreamsofcode  4 months ago

      @@MohammadJairumi you can find it on GitHub at elliottminns/dotfiles

  • @nathaaaaaa
    @nathaaaaaa 6 months ago +164

    Usually instead of write-through, I just DEL the relevant keys and force a new cache miss. Looks very reliable to me

    • @dreamsofcode
      @dreamsofcode  6 months ago +38

      That's a cool idea. I imagine it makes things a little simpler and can work for more advanced aggregations!
      I like it!

    • @xorlop
      @xorlop 6 months ago +14

      lol just left a long-winded comment about this

    • @dreamsofcode
      @dreamsofcode  6 months ago +9

      @@xorlop I'm glad you did!

    • @daleryanaldover6545
      @daleryanaldover6545 6 months ago

      I would do the same 😊

    • @123mrfarid
      @123mrfarid 6 months ago

      Good idea. Thank you..
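
The delete-the-key idea discussed in this thread can be sketched with a plain in-memory map standing in for Redis (all names here are hypothetical; this is a sketch of the cache-aside pattern, not code from the video):

```rust
use std::collections::HashMap;

// In-memory stand-in for Redis: key -> cached value.
struct Cache {
    entries: HashMap<String, String>,
}

impl Cache {
    fn new() -> Self {
        Cache { entries: HashMap::new() }
    }

    // Cache-aside read: return the cached value on a hit,
    // otherwise load from the "database", store, and return it.
    fn get_or_load<F: Fn() -> String>(&mut self, key: &str, load: F) -> String {
        if let Some(v) = self.entries.get(key) {
            return v.clone(); // cache hit
        }
        let v = load(); // cache miss: hit the database
        self.entries.insert(key.to_string(), v.clone());
        v
    }

    // On every write, DEL the key instead of updating it,
    // forcing the next read to miss and reload fresh data.
    fn invalidate(&mut self, key: &str) {
        self.entries.remove(key);
    }
}

fn main() {
    let mut cache = Cache::new();
    let v1 = cache.get_or_load("spell:1", || "fireball v1".to_string());
    assert_eq!(v1, "fireball v1");

    // A write to the database deletes the key rather than updating it.
    cache.invalidate("spell:1");

    // The next read misses and picks up the new value from the database.
    let v2 = cache.get_or_load("spell:1", || "fireball v2".to_string());
    assert_eq!(v2, "fireball v2");
    println!("delete-on-write OK");
}
```

Here `invalidate` plays the role of Redis `DEL`: the next read misses and reloads fresh data, so there is no stale update to reconcile.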

  • @o11k
    @o11k 6 months ago +295

    "There are only two hard things in Computer Science: cache invalidation and naming things" ~Phil Karlton

    • @hansenchrisw
      @hansenchrisw 6 months ago +75

      And off by one errors 😉

    • @Rundik
      @Rundik 6 months ago +10

      And cache invalidation

    • @tacticalassaultanteater9678
      @tacticalassaultanteater9678 6 months ago +1

      @@hansenchrisw and scope bloat

    • @af43bacc
      @af43bacc 6 months ago +5

      Concurrency and floating bugs: "Am I a joke to you?"

    • @markhaus
      @markhaus 6 months ago +1

      @@Rundik and cache invalidation

  • @xorlop
    @xorlop 6 months ago +52

    What a cool video! So many great ideas.
    Another idea for the write-through cache: delete the key instead! I think this could be good because whenever you update the entry, you are resetting its LRU value, which might not be accurate/helpful. I think there are a few cases where a DB write is not aligned with access of the key from the cache. What if a user writes a spell but doesn't use it right away, for example? By deleting it, you are saying save is not the same as use, which might be better aligned for a spell store. Newly updated spells might not be so popular. It also helps minimize the overall cache size, which probably helps the Redis LRU algorithm, which is only approximate LRU.

    • @dreamsofcode
      @dreamsofcode  6 months ago +6

      This is a great idea!
      Deleting does make a lot of sense when it comes to frequently accessed data. I think the time you'd want to use write-through would be when the data itself takes a long time to populate. But even then, you'd likely use some sort of expiration/deletion based resync.
      Another approach would be to not extend the expiration, which I believe you can do with another Redis SET option.

    • @daleryanaldover6545
      @daleryanaldover6545 6 months ago +2

      yes, the principle is to delete the cache entry on every operation except GET requests; GET is where we store to the cache!
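
The "don't extend the expiration" option mentioned above exists in Redis as the `KEEPTTL` flag on `SET` (Redis 6.0+). A toy model of the difference, with `Instant`-based expiry standing in for Redis TTLs (hypothetical names; a sketch, not the video's code):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Value plus an absolute expiry time, mimicking a Redis key with a TTL.
struct Entry {
    value: String,
    expires_at: Instant,
}

struct Cache {
    entries: HashMap<String, Entry>,
}

impl Cache {
    fn new() -> Self {
        Cache { entries: HashMap::new() }
    }

    // Like `SET key value EX ttl`: write the value and (re)start the TTL.
    fn set_with_ttl(&mut self, key: &str, value: &str, ttl: Duration) {
        self.entries.insert(
            key.into(),
            Entry { value: value.into(), expires_at: Instant::now() + ttl },
        );
    }

    // Like `SET key value KEEPTTL`: overwrite the value but keep the
    // existing expiry, so an update does not refresh the key's lifetime.
    // (Simplified: real KEEPTTL also creates missing keys; this sketch
    // only overwrites existing ones.)
    fn set_keep_ttl(&mut self, key: &str, value: &str) {
        if let Some(e) = self.entries.get_mut(key) {
            e.value = value.into(); // expiry untouched
        }
    }

    fn ttl(&self, key: &str) -> Option<Duration> {
        self.entries
            .get(key)
            .map(|e| e.expires_at.saturating_duration_since(Instant::now()))
    }
}

fn main() {
    let mut c = Cache::new();
    c.set_with_ttl("spell:1", "v1", Duration::from_secs(60));
    let before = c.ttl("spell:1").unwrap();
    c.set_keep_ttl("spell:1", "v2"); // update without extending the key's life
    let after = c.ttl("spell:1").unwrap();
    assert!(after <= before); // the TTL was not reset back to 60s
    println!("KEEPTTL-style update OK");
}
```

This addresses the LRU-reset concern from the parent comment: the value is refreshed, but the key still dies on its original schedule.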

  • @underflowexception
    @underflowexception 6 months ago +13

    if you're using PHP and Laravel you can use the dispatchAfterResponse function to save to the cache

  • @WanderingCrow
    @WanderingCrow 6 months ago +20

    Great video, very clear and well articulated!
    Caching is among my biggest weaknesses, I think, after authentication and token management, so I'd be interested to learn more about caches and how/when to use them effectively 🤔

    • @dreamsofcode
      @dreamsofcode  6 months ago +2

      Thank you!
      They definitely have their use cases, but they're not that simple to implement and there's a lot to consider, more so than I even cover in the video!

    • @GreatTaiwan
      @GreatTaiwan 6 months ago +4

      external IDP with SAML2 & OIDC (PKCE) is my biggest weakness

    • @SeanLazer
      @SeanLazer 6 months ago

      My advice is squeeze as much perf as you can out of your primary data store before you add a caching layer! Your RDBMS can take you a lot further than some people realize.

  • @wcrb15
    @wcrb15 6 months ago +3

    Too many people reach for caching as a mechanism to improve performance when actual performance tuning of the application is the more appropriate action. A cache isn't going to save you if your application is over-fetching or inefficiently grabbing data from the DB. But when it's used correctly, caching is awesome!

  • @giuliopimenoff
    @giuliopimenoff 6 months ago +48

    I just removed the Redis cache for my project because I figured out it created more issues than benefits. Databases are already fast as heck, so use caches with intention. I use Redis for session tokens for example but nothing else rn

    • @dreamsofcode
      @dreamsofcode  6 months ago +13

      I think this is the right choice. Session tokens are a good use of caching.

    • @goldensunrayspone
      @goldensunrayspone 6 months ago +2

      honestly I almost never use an external cache for anything I've written, because it helps me considerably more to consider how and WHY I want to cache information for each particular type of data. Some of it never needs a cache at all, and some of it only needs to cache a few bits of data. It also gives you a hint on when your cache is stale, since you know exactly what you're caching and when, rather than caching all data all the time.

    • @giuliopimenoff
      @giuliopimenoff 6 months ago +6

      also when data is relational, caching just triples the effort to keep it synced properly

    • @anthonycavagne4880
      @anthonycavagne4880 6 months ago

      I don't exactly understand: do you store the userId as the key and the token as the value? Why is this better than using a cookie?

    • @giuliopimenoff
      @giuliopimenoff 6 months ago

      @@anthonycavagne4880 I use cookies, and in the cookie I store the session id. Then in Redis I have a hashmap with the user id and session id, so I can get the session data quickly and can also invalidate all sessions if needed
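
The cookie-plus-Redis scheme described above (the cookie holds only a session id; the server maps session ids to users and can invalidate all of a user's sessions at once) can be sketched with in-memory maps (hypothetical names; a sketch of the idea, not the commenter's actual setup):

```rust
use std::collections::{HashMap, HashSet};

// session id -> user id, plus user id -> that user's session ids,
// so all of one user's sessions can be invalidated together.
struct SessionStore {
    sessions: HashMap<String, String>,         // session_id -> user_id
    by_user: HashMap<String, HashSet<String>>, // user_id -> session_ids
}

impl SessionStore {
    fn new() -> Self {
        SessionStore { sessions: HashMap::new(), by_user: HashMap::new() }
    }

    fn create(&mut self, user_id: &str, session_id: &str) {
        self.sessions.insert(session_id.into(), user_id.into());
        self.by_user
            .entry(user_id.into())
            .or_default()
            .insert(session_id.into());
    }

    // Lookup by the session id carried in the cookie.
    fn user_for(&self, session_id: &str) -> Option<&String> {
        self.sessions.get(session_id)
    }

    // Invalidate every session belonging to a user
    // (e.g. on password change or account compromise).
    fn invalidate_user(&mut self, user_id: &str) {
        if let Some(ids) = self.by_user.remove(user_id) {
            for id in ids {
                self.sessions.remove(&id);
            }
        }
    }
}

fn main() {
    let mut store = SessionStore::new();
    store.create("user:42", "sess:a");
    store.create("user:42", "sess:b");
    assert_eq!(store.user_for("sess:a"), Some(&"user:42".to_string()));
    store.invalidate_user("user:42");
    assert_eq!(store.user_for("sess:a"), None);
    assert_eq!(store.user_for("sess:b"), None);
    println!("session invalidation OK");
}
```

The second map is what makes "invalidate all sessions" cheap; with only session_id -> data you would have to scan every key.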

  • @EvanEdwards
    @EvanEdwards 6 months ago +1

    Best walk-in on a codebase I ever did was to realize they had left passthrough on in their cache layer. The cache was invalidated with every request and passed through, presumably as a debug/development one-line short circuit. I pointed it out, they deleted the one line, and vastly improved responsiveness. They were nearly three months post-launch. (I was working on a loosely coupled service; a consulting company operating as a separate department, essentially. I found it when looking into connecting to their database for some features.)

  • @RomanKornev
    @RomanKornev 6 months ago +12

    In the Write-Through caching case, what happens when the concurrent cache write takes slightly longer than expected? Now the client assumes that the data was updated, but when reading it back it would be a race condition, and the value might be stale.

    • @dreamsofcode
      @dreamsofcode  6 months ago +5

      You're correct, that's one drawback of concurrency, as it can introduce a race condition.
      The only way to solve it would be to lock the cache and pass that lock through to the concurrent task.
      The other approach is to do write-through caching with the cache being the first target, although this can lead to some weird state if the database operation fails.
      Either way adds complexity!

    • @NotherPleb
      @NotherPleb 6 months ago

      I was thinking something similar: if the DB or the cache fails, you need a way to sync the state again. I think the easiest solution is to spawn 2 tasks, one for the DB and one for the cache, await the results of both, and handle those cases. However, the response time is then the slower of the two

    • @dreamsofcode
      @dreamsofcode  6 months ago +2

      @@NotherPleb Even spawning two tasks can be complicated, however, especially if one fails and the other doesn't. You need to then reconcile afterwards.

    • @NotherPleb
      @NotherPleb 6 months ago

      @@dreamsofcode yes, but I guess you always need to wait for the result of both in the handler when you mutate data, you can't just "set and forget" as an optimization

    • @EduarteBDO
      @EduarteBDO 6 months ago +1

      I think one solution workflow would be: lock the cache key > update the database > on failure, unlock the cache; on success, delete the cache entry and let a cache miss happen in the future.
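
The workflow in the last reply (lock the cache key, update the database, then delete the cache entry on success or just unlock on failure) might look like this, with in-memory maps standing in for the DB and Redis and a boolean flag simulating DB failure (all names hypothetical):

```rust
use std::collections::{HashMap, HashSet};

struct Store {
    db: HashMap<String, String>,
    cache: HashMap<String, String>,
    locked: HashSet<String>, // keys currently being written
}

impl Store {
    fn new() -> Self {
        Store { db: HashMap::new(), cache: HashMap::new(), locked: HashSet::new() }
    }

    // lock key -> update DB -> on success delete the cache entry
    // (the next read misses and reloads); on failure just unlock,
    // leaving the cache consistent with the unchanged DB.
    fn update(&mut self, key: &str, value: &str, db_ok: bool) -> Result<(), &'static str> {
        if !self.locked.insert(key.to_string()) {
            return Err("key is locked by another writer");
        }
        let result = if db_ok {
            self.db.insert(key.to_string(), value.to_string());
            self.cache.remove(key); // invalidate; a future cache miss refills it
            Ok(())
        } else {
            Err("db write failed") // cache left as-is, still matches the DB
        };
        self.locked.remove(key);
        result
    }
}

fn main() {
    let mut s = Store::new();
    s.cache.insert("k".into(), "old".into());
    s.update("k", "new", true).unwrap();
    assert_eq!(s.db.get("k").map(String::as_str), Some("new"));
    assert!(s.cache.get("k").is_none()); // entry deleted, not updated in place
    println!("lock/update/delete workflow OK");
}
```

Deleting rather than writing the new value means a concurrent reader can never observe a cache entry that is newer or older than the DB row; it can only miss and reload.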

  • @yuu-kun3461
    @yuu-kun3461 6 months ago +7

    After watching the recent PostgreSQL video by "The Art Of The Terminal", it would seem to me that adding Redis to PostgreSQL is not needed for most projects.
    Additionally, as presented in the blog post by martinheinz, a cache can be achieved by creating UNLOGGED tables. And if the key-value pair functionality of Redis is that important, the video mentioned covers that too.

    • @dreamsofcode
      @dreamsofcode  6 months ago +6

      100% agree. In most use cases it's not needed, and the complexity can often outweigh the benefit.
      I don't know if I would consider UNLOGGED tables a viable alternative. I did some benchmarking on them for another video I had planned and they're nowhere near as fast. There are also a lot of caveats with using them, and if you're unaware a table is unlogged, mistakes can happen.

    • @peppybocan
      @peppybocan 6 months ago +1

      I think Redis is more performant than Postgres' unlogged tables, just because Redis is a specific tool optimized for an in-memory store. Postgres, OTOH, has layers of abstraction on top of simple store-and-retrieve functionality. Use the correct tools for the correct uses.

    • @DanniDuck
      @DanniDuck 6 months ago

      @@dreamsofcode *The complexity often does not outweigh the benefit. It's extremely simple and will make everything 100x faster. For example, say you have a bunch of base64-encoded images stored in pg: you can make it so the image (likely ~10 kB each or so) gets stored in memory in its result format, allowing you to make anything involving images significantly faster. It can make things super fast if you use it right, e.g. big queries for a product's info or whatever.

    • @OneShore
      @OneShore 6 months ago

      @@peppybocan Yeah, the difference is that Redis is very lightweight. If you're looking to throw more RAM & CPU at a DB problem, then Redis starts to make sense. Because 8GB Redis + 8GB Postgres is going to outperform 32GB Postgres in many cases.

    • @peppybocan
      @peppybocan 6 months ago

      @@OneShore not necessarily, it depends on the workload. If you have transaction-heavy processing, there is no way around it. E.g. if you are PayPal and you need strong ACID guarantees you may find yourself in a pickle. Storing payment information in-memory is fine as long as you have resiliency built into the application.

  • @Sarwaan001
    @Sarwaan001 6 months ago

    I work on a team that handles very large amounts of data and we usually take a "best tool for the job" approach. E.g. with a graph database as the ground truth, we use it for simple queries that are O(1), use a search DB for search at O(log(n)), and use a trigger that sends data from the graph database: obtain the full object by performing a walk, then send the object to the search DB.
    It feels like this is technically caching, but it's still very fast, and we think of cache databases more as a crutch to buy more time rather than a solution.

  • @posteisnoob5763
    @posteisnoob5763 6 months ago +2

    Thanks for the great video!! I would really like to see your take on when / when not to cache

  • @alirahimi4477
    @alirahimi4477 6 months ago +3

    Most of the time a single select-by-id isn't the bottleneck; rather it's a complex query that returns a possibly big result, and caching that can be a real pain or plainly impossible. When you query by X but update by Y, there is no clear way to use the write-through method to update your cache, because you don't even know which cache keys you should be updating!

    • @dreamsofcode
      @dreamsofcode  6 months ago +2

      Yep, you're correct!
      Apologies if I didn't make that clear in the video; I didn't want to complicate the caching implementation itself, so I went for a simple query

  • @n0kodoko143
    @n0kodoko143 6 months ago +3

    awesome video. I would love to see a 'when to and not to cache' video.

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      Thank you! I shall do one then! 😁

  • @zshanahmad2669
    @zshanahmad2669 6 months ago +2

    great video. my biggest problem with caches in bigger projects is dealing with related data.
    for example, I cached the blogs API: GET blogs/blog_ID. This API returns JSON which has the blog and the information about the author.
    When the data about the author changes, e.g. their name, I have to invalidate blogs/blog_ID too, otherwise users will get the old author data.
    I know I could return only the blog data in the blogs/blog_ID request, but I can't change the frontend, which expects the user data inside the response.

  • @theblckbird
    @theblckbird 6 months ago +3

    In Rust, you can do the following to convert a Result to an Option:
    let my_result = action_that_returns_a_result(); // Result
    let as_option = my_result.ok(); // Option
    It works the other way around as well:
    let option = Some("foo"); // Option
    let as_result = option.ok_or(0); // Result
    let option = Some("foo"); // Option
    let as_result = option.ok_or_else(|| 3 * 3 / 9); // Result

    • @dreamsofcode
      @dreamsofcode  6 months ago +2

      This is much simpler! Thank you.

    • @nathanoy_
      @nathanoy_ 6 months ago

      Awesome write up. I was about to write a similar comment. Now I write this reply to push this one. 👌

    • @Maxelya
      @Maxelya 6 months ago

      I still consider myself lacking experience with Rust, but somehow I knew about these "ok" methods and was about to point it out after watching the vid ^^'.

  • @hunorportik5618
    @hunorportik5618 6 months ago

    Useful info, well described.
    One important thing was left out IMO: using concurrency might actually re-introduce the stale-data issue, since one of the writes might fail due to a non-transient (or improperly handled transient) issue.

    • @dreamsofcode
      @dreamsofcode  6 months ago

      That's correct! This issue becomes even more problematic in a distributed system as well if we horizontally scale our app.

  • @michaelhenze877
    @michaelhenze877 5 months ago

    Would really like to see a comparison between NvChad and your current NeoVim configs.

  • @nexovec
    @nexovec 6 months ago +2

    I just realized you can literally ship a product that's just static files and a Postgres server. Curb your stack, please.

  • @fahimferdous1641
    @fahimferdous1641 6 months ago +1

    I legit thought today's sponsor was docker XD
    Are you using the embedded terminal though?

  • @CrypticConsole
    @CrypticConsole 6 months ago +3

    Why do you need to cache this in Redis? Could you not just use master slave database scaling for read heavy workloads?

    • @dreamsofcode
      @dreamsofcode  6 months ago

      Read replication is a decent solution in many use cases, especially read-heavy ones as you mentioned.
      Just like caching, it's a tradeoff, so it does depend on what your data model / system looks like.

  • @kennedydre8074
    @kennedydre8074 6 months ago

    I would really love to see a video of when to cache and when not to cache, thank you.

  • @legobuildingsrewiew7538
    @legobuildingsrewiew7538 6 months ago

    Instantly subscribed! Great video.

  • @TheTwober
    @TheTwober 6 months ago +1

    Now imagine you program in Java and all those problems are already solved. :)
    Just use a SoftReference that will be cleared by the GC if it needs memory, and the attached ReferenceQueue can be (blockingly) polled by a background thread, so your cache gets informed whenever something got removed by the GC. A near perfect cache is nowadays literally 3 lines of code in Java.

  • @mementomori8856
    @mementomori8856 6 months ago

    crazy that you release this the same day as I start implementing Redis from scratch

  • @ordinarygg
    @ordinarygg 6 months ago +2

    90% of issues are missing indexes, or crappy backend code that spends 99% of the time in the backend and 1% in the DB. So before you say the DB is slow, please benchmark your API and DB independently. A simple 8-core Ryzen machine can handle 300k selects/sec and 60k inserts/sec using PostgreSQL. 256 cores and 1 TB of RAM will solve a lot of issues in a single instance. People don't even reach the level of vertical scaling first, instead starting to scale horizontally; a huge mistake for small-to-mid-size businesses and startups.

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      Yep, I agree with you (I believe I stated something similar at the end).
      There are certain use cases where caching applies, but in general, optimizing your database queries is the correct approach.
      There is a case for horizontal scaling over vertical still, especially wrt availability. But even then you can use read replication to improve that.

    • @kriffos
      @kriffos 6 months ago

      If you want a really fast cache, it is a good idea to scale the cache together with your application and have no http request to the cache. Most of the time spent to get the data is probably http overhead. I think cache as micro service is - most of the time - a bad idea.

  • @fahimferdous1641
    @fahimferdous1641 6 months ago +1

    What would be an example use case for the random eviction policy?

    • @I25mI25
      @I25mI25 6 months ago +1

      LRU comes with a small overhead since you have to somehow store/maintain a "list" of which items were last accessed. In many "normal" cases, it is likely that an item that was recently accessed will be accessed again, so keeping the newer/most frequently used ones in cache is worth the overhead. If your access patterns on the other hand are mostly random, keeping track of usage patterns isn't really worth it, so you can just delete any random entry. You might still want to use a cache even in random use cases when the occasional random cache hit might still give a big enough boost/save you money in bandwidth/storage access cost to make the added complexity of a cache worth it.

    • @dreamsofcode
      @dreamsofcode  6 months ago

      This is a great explanation.
      As for specific use cases, it's hard to really describe any that would fall into this. But any data / queries that have no discernible pattern, or a system where the likelihood of needing a key is the same across your data set.

  • @Daniel-i8v2i
    @Daniel-i8v2i 5 months ago

    what video editing software do you use? it looks like you're on Linux

  • @PiesekLeszek90
    @PiesekLeszek90 6 months ago

    Write-through cache sounds like you just have 2 databases running at the same time, but I assume that's because of the simplicity of the example? I'd imagine you only cache the prepared API response with all its relations and after applying logic, and not the "raw data" as it is in the main database?
    This doesn't sound too optimal when you update one record that applies to many users, but each user needs its own cached version?

  • @Amejonah
    @Amejonah 6 months ago

    There is one big question I have had for a long time: how do distributed microservices work? Especially, how can scaling of certain services be achieved? What role do buses/message brokers play in it?
    You might be the one who can address these questions using simpler terms.

  • @TR1XT3RZ360
    @TR1XT3RZ360 6 months ago

    can you share your terminal setup?

  • @skr-kute1677
    @skr-kute1677 6 months ago

    Thanks for the vid
    Informative and simple

  • @Zutraxi
    @Zutraxi 4 months ago

    Don't forget the retry policy for when your concurrent write to the cache fails. What if the API crashes as the write is happening?
    Better use a fault-handling outbox pattern.
    Suddenly caching is slower than accessing the database.

  • @egemengol306
    @egemengol306 6 months ago

    For the life of me I don't understand the need for Redis.
    When I need caching I always reach for in-memory caching libraries right in my codebase, reducing latency along with development and deployment complexity, while staying featureful.
    If the language is memory hungry, in-memory SQLite works really well for most cases.
    If I want centralized state I reach for the database itself; Postgres is excellent.
    Under which circumstances would Redis be the first choice?
    Edit 1: Multiple instances caching mutable data would be one, I suppose

  • @ebukaume
    @ebukaume 6 months ago

    What happens when the spawned task completes with an error? It seems we didn't completely solve the stale data problem.

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      Correct, and in a distributed system, this is even more difficult!

  • @Myrkytyn
    @Myrkytyn 3 months ago

    When to cache?

  • @neliosantos4014
    @neliosantos4014 4 months ago

    Amazing!! 😄

  • @SlavomirDanas
    @SlavomirDanas 6 months ago +1

    Whoa, whoa, whoa! Just 8 seconds into the video and I see an infographic with the cache layer in the completely wrong spot.

  • @ivan_adamovich
    @ivan_adamovich 6 months ago

    There is one thing I did not understand: 50 ms is certainly a good response time, but for the simplest API written in Rust it somehow seems like a lot, don't you think? (I use Go in my projects, so I'm a noob in Rust)

    • @illyias
      @illyias 5 months ago

      You won't need caching in a simple project, your database will be able to handle the load fine.

  • @pieter5466
    @pieter5466 6 months ago

    8:14 Makes you wonder whether there is *ever* a good use case for "random order"

  • @animanaut
    @animanaut 6 months ago

    if you want to enable client-side caching there are also ETag request/response headers that can be used as well. a whole other topic, but i believe they use hashes to let the backend decide whether to respond with a potentially big payload over the network or not, if the client's hash looks OK compared to what is present in the server db/cache already (returning HTTP code 304 instead).

    • @hansenchrisw
      @hansenchrisw 6 months ago +1

      +1, though If-Modified-Since is a bit simpler and usually sufficient

    • @hansenchrisw
      @hansenchrisw 6 months ago

      @CesarLP96 search for HTTP conditional requests

    • @animanaut
      @animanaut 6 months ago

      developer pages from mozilla would be one recommendation from me, also known as mdn

    • @animanaut
      @animanaut 6 months ago

      @CesarLP96 mozilla developer pages would be one page, just search for etag

    • @animanaut
      @animanaut 6 months ago

      fyi, i answered multiple times now but yt refuses to show it for some strange reason. not sure you will see this comment as well. one example would be the mozilla developer network
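
The ETag mechanism described in this thread can be sketched as a pure function: hash the response body, and if the client's If-None-Match header carries the same tag, answer 304 with no body (a sketch using a non-cryptographic std hasher for illustration; real servers typically use a content digest):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Compute a quoted ETag for a response body. DefaultHasher is just
// for illustration; production code would use a content digest.
fn etag_for(body: &str) -> String {
    let mut h = DefaultHasher::new();
    body.hash(&mut h);
    format!("\"{:x}\"", h.finish())
}

// Conditional GET: if the client's If-None-Match matches the current
// ETag, answer 304 Not Modified with no body; otherwise 200 + payload.
fn respond(body: &str, if_none_match: Option<&str>) -> (u16, Option<String>) {
    let tag = etag_for(body);
    match if_none_match {
        Some(t) if t == tag.as_str() => (304, None),
        _ => (200, Some(body.to_string())),
    }
}

fn main() {
    let body = "{\"spells\": [\"fireball\"]}";

    // First request: no validator, full 200 response.
    let (status, payload) = respond(body, None);
    assert_eq!(status, 200);
    assert!(payload.is_some());

    // Revalidation: client sends the tag back, body is skipped.
    let tag = etag_for(body);
    let (status, payload) = respond(body, Some(&tag));
    assert_eq!(status, 304);
    assert!(payload.is_none());
    println!("conditional GET OK");
}
```

This is client-side caching: the server still computes the response, but the network transfer of a large payload is avoided when nothing changed.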

  • @hosamhamdy258
    @hosamhamdy258 6 months ago

    great video
    can you make a "when to cache or not" video too?
    thanks in advance

  • @PhilfreezeCH
    @PhilfreezeCH 6 months ago

    Who ever thought caching was simple?
    It's one of those rare things that's hard on all levels. It's very difficult in hardware development, difficult in software, and ridiculously difficult in networking; it's just brutal.
    Plus it always requires a ridiculous amount of benchmarking and verification to make sure you don't accidentally degrade performance on certain workloads or, even worse, mess up data.

  • @arcadierosca9818
    @arcadierosca9818 6 months ago

    Can you create a video on how to make videos like that? It's amazing!!!

  • @fadhilinjagi1090
    @fadhilinjagi1090 6 months ago

    What if you deleted the cache entry right before you updated/deleted the record in the DB? Would this prevent the race condition?

    • @fadhilinjagi1090
      @fadhilinjagi1090 6 months ago

      I think that's optimistic mutation, if I'm not wrong.

  • @FinlayDaG33k
    @FinlayDaG33k 6 months ago +1

    There is a major issue tho... If your key expires, and suddenly 1K requests come in, you're now hitting the database with all 1K requests and may overload the database anyways.
    Not exactly ideal.

    • @dreamsofcode
      @dreamsofcode  6 months ago +2

      You're correct, this is known as the thundering herd problem.
      You can solve it by using something like a single-flight mechanism or connection pooling; again, it's more complexity though.

    • @mind.journey
      @mind.journey 6 months ago +1

      I don't know if it's optimal, but what I usually do is never let the key expire, and instead just create a cronjob (or something similar) that periodically refreshes the key with updated data.

    • @FinlayDaG33k
      @FinlayDaG33k 6 months ago

      @@mind.journey This works depending on the goals, yes.
      If it's data that you know will be highly sought after by your code, it can definitely work.
      However, you are now burdened with the task of guessing which data would benefit from it.
      It can also lead to you having a lot of data in the cache that you may only need once in a blue moon, thus wasting resources in fetching and keeping it cached.
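
The single-flight mechanism mentioned above as a fix for the thundering herd can be sketched with a mutex and a condvar: the first caller to miss becomes the loader, and everyone else waits for its result instead of stampeding the database (a hypothetical, std-only sketch):

```rust
use std::collections::{HashMap, HashSet};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

// Single-flight: when a key expires and many requests miss at once,
// only one caller loads from the database; the rest wait for its result.
struct SingleFlight {
    // (keys with a load in flight, completed results)
    state: Mutex<(HashSet<String>, HashMap<String, String>)>,
    cond: Condvar,
}

impl SingleFlight {
    fn new() -> Self {
        SingleFlight { state: Mutex::new((HashSet::new(), HashMap::new())), cond: Condvar::new() }
    }

    fn get<F: Fn() -> String>(&self, key: &str, load: F) -> String {
        let mut guard = self.state.lock().unwrap();
        loop {
            if let Some(v) = guard.1.get(key) {
                return v.clone(); // someone already loaded it
            }
            if guard.0.insert(key.to_string()) {
                break; // we are the designated loader for this key
            }
            guard = self.cond.wait(guard).unwrap(); // a load is in flight; wait
        }
        drop(guard);
        let v = load(); // the expensive DB hit happens exactly once
        let mut guard = self.state.lock().unwrap();
        guard.0.remove(key);
        guard.1.insert(key.to_string(), v.clone());
        self.cond.notify_all();
        v
    }
}

fn main() {
    let sf = Arc::new(SingleFlight::new());
    let db_hits = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let (sf, hits) = (Arc::clone(&sf), Arc::clone(&db_hits));
            thread::spawn(move || {
                sf.get("spell:1", || {
                    hits.fetch_add(1, Ordering::SeqCst);
                    thread::sleep(Duration::from_millis(20)); // slow DB query
                    "fireball".to_string()
                })
            })
        })
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), "fireball");
    }
    assert_eq!(db_hits.load(Ordering::SeqCst), 1); // one DB hit for 8 requests
    println!("single-flight OK");
}
```

A production version would also evict the stored result after a TTL so the key can be refreshed; this sketch only shows the request-coalescing part.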

  • @vinii2815
    @vinii2815 6 months ago

    hey, sorry, this is off the topic of the video, but will you make a new video about NvChad configuration? their new file structure is very confusing and I haven't seen anyone with an updated tutorial for it yet

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      I'll be redoing the neovim content soon! I recommend staying with NvChad 2.0 in the meantime!

  • @Avanta1
    @Avanta1 6 months ago

    I'm not very familiar with async Rust, but is there any chance of a race condition when updating the cache? What if a thread that was spawned later acquires the lock before an earlier spawned thread?

    • @dreamsofcode
      @dreamsofcode  6 months ago

      You're correct, there absolutely is a chance. A race condition is introduced by making the update concurrent with the response.
      If you want to ensure 100% consistency then performing the update synchronously would be preferable!

    • @Avanta1
      @Avanta1 6 months ago

      @@dreamsofcode Cool, thanks for replying!

    • @dreamsofcode
      @dreamsofcode  6 months ago

      @@Avanta1 Thanks for asking the question!

  • @Cal97g
    @Cal97g 6 months ago

    It's not stale, it's just eventually consistent

  • @rando521
    @rando521 6 months ago

    so i have a question since i am new to rust and axum:
    the appstate is some amalgamation of arc/mutex,
    and you lock it every time you want to access the db or redis cache.
    wouldn't this just mean you are making the asynchronous runtime semi-synchronous?

    • @dreamsofcode
      @dreamsofcode  6 months ago

      It's a great question.
      My implementation in the video is naive, mainly due to simplifying the code as much as possible. The requests are still asynchronous, but you're correct that the lock would prevent concurrent requests from both accessing the shared state.
      An improved implementation would be to use either an RW lock, or a better abstraction of the state that only locks when needed (rather than at the start of the request)

    • @rando521
      @rando521 6 months ago

      thanks, kinda new to rust and async
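
The RW-lock suggestion in the reply above can be sketched with `std::sync::RwLock`: reads take a shared lock and can proceed concurrently, while a write takes the lock exclusively and only for the duration of the mutation (a hypothetical sketch; axum shared state would typically use an async-aware lock such as `tokio::sync::RwLock` instead):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Shared app state behind an RwLock: many concurrent readers,
// exclusive access only while actually mutating.
struct AppState {
    cache: RwLock<Vec<String>>,
}

fn main() {
    let state = Arc::new(AppState { cache: RwLock::new(vec!["fireball".to_string()]) });

    // Several readers can hold the shared lock at the same time.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let s = Arc::clone(&state);
            thread::spawn(move || s.cache.read().unwrap().len())
        })
        .collect();
    for r in readers {
        assert_eq!(r.join().unwrap(), 1);
    }

    // A writer takes the lock exclusively, and only for the mutation,
    // rather than for the whole lifetime of a request.
    state.cache.write().unwrap().push("icebolt".to_string());
    assert_eq!(state.cache.read().unwrap().len(), 2);
    println!("RwLock state OK");
}
```

Compared with a plain Mutex held for the whole handler, this keeps read-heavy request handling concurrent, which is exactly the workload a cache is meant to serve.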

  • @frazuppi4897
    @frazuppi4897 6 months ago

    amazing video, will check out aiven for sure

  • @M3t4lstorm
    @M3t4lstorm 6 months ago

    Note: In the write-through example, if your application crashes/errors/gets killed before the cache update is written to redis (after the DB write) you will have stale data forever.

    • @dreamsofcode
      @dreamsofcode  6 months ago

      You mean until the TTL?

    • @liu-river
      @liu-river 6 months ago

      yeh, but if you do it synchronously, updating redis after a successful DB write, then you sacrifice speed. I guess you can implement some kind of rollback if either fails?

  • @its_maalik
    @its_maalik 5 months ago +1

    Adding a cache should be the last resort for achieving good performance. The majority of applications will do just fine without a cache if they nail the data modeling and query optimization.

  • @perz1val
    @perz1val 6 months ago

    Looking at the comments, I think you should've used a request that queries multiple tables of normalized data into a single object. Like /user/2/permissions being: user + user_role + role_permission + permission (a list of permission names). Then the benefits are clear. Using a cache to store a SELECT * FROM table; is a bad example.

    • @dreamsofcode
      @dreamsofcode  6 months ago

      Yeah, that's fair. I wanted to keep the interface as simple as possible so as not to distract from the caching itself.
      My original setup was doing a string search across 10m rows, but that added more complexity to the examples (and at that point an index is still a better solution).

  • @backupmemories897
    @backupmemories897 6 months ago

    sometimes adding a cache slows things down but scales better, because whenever you do something you call that cache system... another step.

    • @dreamsofcode
      @dreamsofcode  6 months ago

      Absolutely!
      That's the problem in the case of the inserts at first. It improved the performance of reads on a cache hit, but caused the timings to increase by 66% on a cache miss.

  • @Ca1vema
    @Ca1vema 6 months ago

    Dunno what you're talking about, to add cache all I need is to put 2 lines in framework settings 🙃

  • @shady4tv
    @shady4tv 6 months ago

    ironic that a video about Redis comes out just before everyone drops it for going closed source.

    • @dreamsofcode
      @dreamsofcode  6 months ago +3

      Bad timing!
      Although tbf, this can apply to any caching solution. + I get to review all of the forks that are coming

    • @shady4tv
      @shady4tv 6 months ago

      @@dreamsofcode Honestly the timing is perfect! Redis is hot in the news cycle right now and you're right: this video isn't really about Redis per se. But it's actually a great introduction for people who are uninformed about the software and want to get up to speed on all that is happening with it right now. I hope you get hella views from this, bud! :)

    • @HUEHUEUHEPony
      @HUEHUEUHEPony 6 months ago

      I mean it is only closed source if you are a big company

    • @mrmelon54
      @mrmelon54 6 months ago

      @@HUEHUEUHEPony no? The new licensing doesn't fall under the definition of open source, and isn't accepted by the open source initiative.

  • @lemonking4076
    @lemonking4076 6 months ago +1

    Nice video! But I don't understand why would a dev torture themself with rust

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      🤣🤣🤣😭😭😭

    • @lemonking4076
      @lemonking4076 6 months ago

      @@dreamsofcode it's just way too verbose and not easily readable 😂🙀
      I hope this comment doesn't turn into a flamewar!!!

  • @sieunpark2160
    @sieunpark2160 6 months ago +3

    first place!

    • @ariseyhun2085
      @ariseyhun2085 6 months ago

      Ok

    • @itsme3217
      @itsme3217 6 months ago

      Is this your life achievement ?

    • @pythagoran
      @pythagoran 6 months ago

      Congratulations and/or I'm sorry to hear that

    • @sieunpark2160
      @sieunpark2160 6 months ago

      @itsme3217 yeah my mom is proud of me 😁

  • @IS2511_watcher
    @IS2511_watcher 6 months ago

    4:26 `.unwrap_or(None)` can be shortened to `.ok()` for `Result`, more idiomatic too.
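    A tiny standalone illustration of these idioms (made-up values, not the video's actual code). One caveat: when the Ok value is itself an Option, the equivalent combinator form is `.ok().flatten()` rather than `.ok()` alone:

```rust
fn main() {
    // For a plain Result<T, E>, `.ok()` turns it into an Option<T>,
    // discarding the error value.
    let parsed: Result<i32, std::num::ParseIntError> = "42".parse();
    assert_eq!(parsed.ok(), Some(42));

    // If the Ok value is itself an Option (e.g. a GET that may find no
    // key), `.unwrap_or(None)` collapses both the error and the missing
    // case to None; `.ok().flatten()` is the equivalent combinator form.
    let found: Result<Option<i32>, &str> = Ok(Some(7));
    assert_eq!(found.unwrap_or(None), Some(7));

    let errored: Result<Option<i32>, &str> = Err("connection refused");
    assert_eq!(errored.unwrap_or(None), None);
    assert_eq!(errored.ok().flatten(), None);
}
```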

  • @Cranked1
    @Cranked1 6 months ago +28

    Making the cache write independent of the request's completion can be dangerous, because if writing to the cache fails you have a big problem. It can also happen that the user moves on to make another request before the cache was written for the previous one, which results in wrong data. This can be completely inconsistent and you have no guarantees (unlike database ACID). Even a database transaction won't save you, because the system can still fail between the cache write and the transaction commit.

    • @dreamsofcode
      @dreamsofcode  6 months ago +5

      Yep, 100%.
      In a distributed system, this is even more challenging as you'd likely need to lock the key in the cache as well, which is even more complexity.

    • @DryBones111
      @DryBones111 6 months ago +3

      @@dreamsofcode Eventual consistency is both a blessing and a curse. Just like how async colours your functions, eventual consistency colours your whole system.

  • @rodemka
    @rodemka 6 months ago +11

    Video checklist:
    ✓ Editor - Neovim
    ✓ DB - PostgreSQL
    ✓ Cache - Redis
    ✓ gRPC
    - Full-text searches: meilisearch/sonic/typesense and postgres tsvector/tsquery
    - Authentication and authorization - oauth2, saml, openid, jwt, etc. endless list
    - full axum course from "todo app" -> "url shortener app" -> "pocketbase like app"
    - template engines + hype about htmx
    - reports from db - rust/go + db -> pdf creation
    Thank you for the inspiring high quality videos!

    • @dreamsofcode
      @dreamsofcode  6 months ago +2

      Thank you for the great suggestions!

  • @evccyr
    @evccyr 6 months ago +11

    I will do no push-ups for every like this comment gets. I'm sore from the last time.

    • @foziezzz1250
      @foziezzz1250 6 months ago

      Would like to join you in this

    • @martin4ata933
      @martin4ata933 6 months ago

      LETS GOO

  • @thienlacho860
    @thienlacho860 6 months ago +4

    With write-through you may face the dual-write problem: there may be a successful write to the database but a timeout on the Redis call, in which case stale data remains in Redis. In my application I use Debezium to capture changes in the database and produce them to a Kafka topic, then a background process consumes those changes and applies cache invalidation. In my opinion, deleting the cache entry is safer than changing it: one cache entry may be affected by many different actions, and those actions may arrive concurrently, so if you change the cache in the wrong order due to async, it may end up holding the wrong result. Just delete the key, for safety and memory efficiency.
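    As an illustration, here's a minimal sketch of that delete-on-write invalidation strategy, with plain HashMaps standing in for Postgres and Redis (all names and types here are invented for the example):

```rust
use std::collections::HashMap;

// Invalidate-on-write: update the store of record, then DEL the cached
// key. The next read takes a cache miss and repopulates with fresh data.
struct SpellService {
    db: HashMap<String, String>,    // stand-in for Postgres
    cache: HashMap<String, String>, // stand-in for Redis
}

impl SpellService {
    fn new() -> Self {
        SpellService { db: HashMap::new(), cache: HashMap::new() }
    }

    fn write(&mut self, key: &str, value: &str) {
        self.db.insert(key.to_string(), value.to_string());
        // Delete rather than set: concurrent writers can't leave the cache
        // holding a reordered value, and keys that are never read again
        // don't occupy cache memory.
        self.cache.remove(key);
    }

    fn read(&mut self, key: &str) -> Option<String> {
        if let Some(v) = self.cache.get(key) {
            return Some(v.clone()); // cache hit
        }
        let v = self.db.get(key).cloned()?; // miss: fall through to the DB
        self.cache.insert(key.to_string(), v.clone());
        Some(v)
    }
}

fn main() {
    let mut svc = SpellService::new();
    svc.write("fireball", "v1");
    assert_eq!(svc.read("fireball").as_deref(), Some("v1"));
    svc.write("fireball", "v2"); // invalidates the cached "v1"
    assert_eq!(svc.read("fireball").as_deref(), Some("v2"));
    assert_eq!(svc.read("missing"), None);
}
```

Note this sketch still has the dual-write window the comment describes; the Debezium/Kafka approach moves the delete into a consumer of the database's change log so it can be retried.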

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      Debezium is pretty great. I wanted to showcase it in this video but it blew the scope out way too much!
      CDC caching is dope.

  • @kartik180rajesh1
    @kartik180rajesh1 6 months ago +2

    If you use Redis as a cloud-hosted service, isn't that defeating the purpose of the cache? The cache should ideally be as close to your backend service as possible - either in your network or in the same instance's memory.

  • @penguindrummaster
    @penguindrummaster 6 months ago +4

    I like the final takeaway saying caching is not your first step, and that database optimization should always be a consideration. I've seen too many people complicate their tech stacks just to avoid tackling an otherwise simple problem. Much like C, just because SQL is old doesn't mean it isn't really good at certain tasks.

  • @LauriePoulter
    @LauriePoulter 6 months ago +1

    any tips for avoiding stale data when dealing with a 3rd party service that can be updated by other actors?

  • @Affax
    @Affax 6 months ago +2

    Welp, time to move to KeyDB or DragonflyDB, at least they both are redis API compatible haha

  • @watzyh
    @watzyh 5 months ago

    I never use Redis for caching - it's a database. Beyond simple key-value storage, I use it for handling the time dimension in a program (rate limiting and job queues), which is very useful for a web server where each request runs separately.
    For caching, that's a job for a web server like nginx. It's far, far more efficient and performant. I've never had any issue with cache invalidation or custom cache keys, and you can control the nginx cache programmatically just like a Redis cache.

  • @foreverexpanding
    @foreverexpanding 6 months ago

    Why not update the cache when we update the DB? In that case there would be no need to worry about it being stale.

  • @petar567
    @petar567 6 months ago +1

    Great video. Thanks for the information, also I would appreciate it if you make a video of when to use cache and when not.

  • @marcing5380
    @marcing5380 6 months ago +1

    One major thing to remember about caching (or in general wherever you have two separate sources of truth) is that you'll always run into eventual consistency, so you shouldn't use it in every possible scenario. That is, there is a non-zero window during which the cache and the DB haven't synced up, and the data is inconsistent but still readable. The only way to avoid it, I think, is explicit locking, but that slows the whole thing down considerably: when you update a table, you lock everything related to it and only unlock after both the write and the cache update have completed.

  • @user-qr4jf4tv2x
    @user-qr4jf4tv2x several months ago

    the perfect cache is when you can natively plug cache on a database

  • @peppybocan
    @peppybocan 6 months ago +2

    Unless you're handling thousands of concurrent users who pay you nothing (free users), you don't need to worry about caches. The right DB design with a right-sized DB node can handle thousands of concurrent users. Once you start handling tens of thousands, then you think about caching - but at that point it should be fairly easy to scale the particular parts of your DB, because you *know* what is slow.

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      100%
      Profiling your queries and using well placed indexes is always a better option.
      Sometimes it's not possible (such as hitting a remote API), but if you have control of the database then it's always the better option.

    • @parkourbee2
      @parkourbee2 6 months ago

      Even then, do I really need a cache? Why not just index what needs to be indexed?

    • @peppybocan
      @peppybocan 6 months ago

      @@dreamsofcode yeah absolutely! Those limits are external, and that's when it matters. I think Redis is a viable option in that case.
      Even things like session authentication can be done with a silly in-memory LRU cache and it will get you 90-95% of the way to the goal.
      But people tend to be very quick to stuff the project with a billion dependencies.

    • @dreamsofcode
      @dreamsofcode  6 months ago +1

      @@parkourbee2 If you don't have access to the database? I'm thinking more of remote APIs etc., where you don't control the data at all.
      For example, we had an API that hit NIST for CVEs and was incredibly slow; in that case, caching was a good solution.

    • @luca4479
      @luca4479 6 months ago

      Postgres has built-in caching which is already crazy performant

  • @archip8021
    @archip8021 6 months ago +1

    I have a table of about ~20 items that I need very, very often, and it rarely changes.
    Is this a good use case for caching? Can a whole table be cached like this?

    • @dreamsofcode
      @dreamsofcode  6 months ago +3

      I think for that size, you're likely not going to need caching.
      Caching is more for when you have slower queries, such as aggregations or hitting an API that has poor performance.
      Adding in a cache adds in complexity and it's probably not worth the performance gain you might receive.

    • @arturpendrag0n270
      @arturpendrag0n270 6 months ago +1

      Can't you load them at startup and use some singleton or put them in some "global" variable, so you won't have to request them unless needed?
      Even if that's not the case, the DB usually has caching mechanisms for repeated queries, so for such a small number of records it's probably unnecessary.

  • @Fanaro
    @Fanaro 6 months ago

    Please make a video on how you edit your videos!

  • @JuanPabloCisneros2207
    @JuanPabloCisneros2207 6 months ago

    Caching is always tricky. In the lazy loading presented, you can end up hitting the dual-write problem, as Postgres is way slower than Redis. If the system needs concurrency, it could be a tricky bug to solve, I think.

  • @saxtant
    @saxtant 6 months ago

    You do know your hardware is pretty much taking care of this already?

  • @krateskim4169
    @krateskim4169 6 months ago

    I would like to know when to cache and when not to, please

  • @nedimkulovac6394
    @nedimkulovac6394 5 months ago

    Man, this video is awesome. By far the best and clearest explanation I've come across. Thanks a ton!
    I would like to see more videos explaining cache strategies and when to use caching and when not to use it.

  • @neelg7057
    @neelg7057 6 months ago

    Which font is that in your nvim? :)

  • @betoharres
    @betoharres 6 months ago

    why did you make the write-through cache concurrent? There's a chance that two concurrent requests return mismatched values based on what's in the database; maybe I'm missing something here

    • @dreamsofcode
      @dreamsofcode  6 months ago

      You're correct.
      However, even with it being serial, there's no guarantee in a distributed / horizontally scaled system of a race condition not occurring.
      With a cache, it's almost impossible to guarantee consistency without locking the cache itself. In a distributed system, that's going to be even more complex.

  • @saywaify
    @saywaify 6 months ago

    Can you please share your nvim setup (or at least the colorscheme) ?? It looks so fine

    • @perz1val
      @perz1val 6 months ago

      Colorscheme looks like Catppuccin

  • @allroni
    @allroni 6 months ago

    Great video, as usual! 🙂

  • @Biowulf21
    @Biowulf21 6 months ago

    Love your videos man. Keep it up!

  • @ShimoriUta77
    @ShimoriUta77 6 months ago

    Rust code is so beautiful.

    • @dreamsofcode
      @dreamsofcode  6 months ago

      😅
      It's not known for its beauty

  • @CottidaeSEA
    @CottidaeSEA 6 months ago

    Cache is all fun and games until the cache is automatically invalidated due to a timer and everyone hits the same slow query at the same time.

    • @dreamsofcode
      @dreamsofcode  6 months ago

      This is a good one! The thundering herd problem.
      There are ways to solve it, using something such as single flight, although again that adds more complexity. It's much harder to solve in a distributed system.
      I'll probably do a video on it as a few people have mentioned it!
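      A rough single-flight sketch in Rust (invented names, with threads standing in for concurrent requests): callers asking for the same key share one computation instead of all stampeding the database at once.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Per-key "single flight": concurrent callers for the same key share one
// computation instead of all hitting the slow backing store. A real
// implementation would evict the entry once the call finishes so later
// cache misses recompute; this sketch keeps it simple.
struct SingleFlight {
    calls: Mutex<HashMap<String, Arc<Mutex<Option<String>>>>>,
}

impl SingleFlight {
    fn new() -> Self {
        SingleFlight { calls: Mutex::new(HashMap::new()) }
    }

    fn get(&self, key: &str, compute: impl FnOnce() -> String) -> String {
        // Grab (or create) the shared slot for this key.
        let cell = {
            let mut calls = self.calls.lock().unwrap();
            calls
                .entry(key.to_string())
                .or_insert_with(|| Arc::new(Mutex::new(None)))
                .clone()
        };
        // The first caller through this lock runs `compute`; everyone
        // else blocks on the lock and then reuses the stored result.
        let mut slot = cell.lock().unwrap();
        if slot.is_none() {
            *slot = Some(compute());
        }
        slot.clone().unwrap()
    }
}

fn main() {
    use std::sync::atomic::{AtomicUsize, Ordering};
    let sf = Arc::new(SingleFlight::new());
    let backend_hits = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let (sf, hits) = (sf.clone(), backend_hits.clone());
            std::thread::spawn(move || {
                sf.get("spells", || {
                    hits.fetch_add(1, Ordering::SeqCst); // the "slow query"
                    "aggregated spell stats".to_string()
                })
            })
        })
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), "aggregated spell stats");
    }
    // All eight concurrent requests shared a single backing-store call.
    assert_eq!(backend_hits.load(Ordering::SeqCst), 1);
}
```

Go's golang.org/x/sync/singleflight package is the canonical version of this pattern; in a distributed system you'd need a shared lock (e.g. in the cache itself) rather than an in-process one.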

    • @CottidaeSEA
      @CottidaeSEA 6 months ago

      @@dreamsofcode Last time I had that issue, I solved it by forcibly fetching and caching with cron. A bit of a hacky, antipattern way of solving it, but it works really well.

  • @youtube_user9921
    @youtube_user9921 6 months ago

    Hi. Can you also post tutorial lectures on Nix?

    • @dreamsofcode
      @dreamsofcode  6 months ago

      Absolutely! I'll likely do it on my other channel which is more focused on Linux and FOSS. I've been playing with NixOS more on there

    • @youtube_user9921
      @youtube_user9921 6 months ago

      Can you tell me which channel it is?

  • @siya.abc123
    @siya.abc123 6 months ago +1

    Rust syntax 😭😭😭🤢🤢🤢🤢

    • @dreamsofcode
      @dreamsofcode  6 months ago +2

      I'm with ya.
      I think I'm gonna use Go more for demonstrating anything non language specific in the future!

  • @bavidlynx3409
    @bavidlynx3409 6 months ago

    The comments and the video made me realise that caching is rather unnecessary and creates a lot of overhead and issues so imma stay away from it

    • @TheHTMLCode
      @TheHTMLCode 6 months ago

      I don't think that's necessarily the best decision - if you don't cache, you will incur performance issues under certain circumstances. The example illustrated in this video is a very simple one that could have been solved by efficiently indexing your database, but as you scale or encounter more complex problems you may want to consider caching for latency-sensitive functionality.
      At work we have a workflow that requires our operators to pick orders in a warehouse. Fetching the pick list (all the instructions to carry out the picking of an order) takes around 500ms to generate. The pick list reflects the entire state of the current pick journey and uses a cache write-through strategy to update the cached pick list after every scan in the warehouse. Without a cache, the front end would need to rebuild the list from the database every time it retrieved the next instruction; waiting 500ms after completing a stop to fetch the next one would suck, while fetching from cache and having a result in 30ms is far better. The tradeoff here is maintaining the complexity of the cache in order to achieve the performance SLO (service level objective) we promised to our consumer (warehouse staff).
      For simple applications you may be able to keep away from caching, but I'd definitely learn it and keep it as a tool in your toolbox - I'm sure sometime in your career it'll be useful :) hope that helps!

    • @Amejonah
      @Amejonah 6 months ago

      I currently use caching (through Postgres - I should really switch to Redis) to make values survive a restart of the application, since requesting the data takes a lot of time and consumes rate-limit tokens.

  • @temie933
    @temie933 6 months ago

    Can you create a "how to Arch" video, showing how you configured Arch Linux?

  • @goldensunrayspone
    @goldensunrayspone 6 months ago

    The number one question you should be asking yourself when setting up a data layer is "does it matter HOW my data is stored?"
    If you discover that it doesn't matter at all, a flat JSON file is a decent option. If you discover that you need to connect several devices together, then a fast but scalable network database like Postgres or RavenDB will work JUST fine. If you discover that you need to request the data far more frequently than your systems are able to handle, THEN you need a cache.

    • @hansenchrisw
      @hansenchrisw 6 months ago

      +1, engineers often overcomplicate things. I think it was Knuth who said premature optimization is the root of all evil.

  • @realbootybabe
    @realbootybabe 6 months ago +1

    I like your videos so much! Thanks a lot 😄 How do you create your videos? What tools do you use? Maybe you want to create a video about that! 😎 thanks!

    • @dreamsofcode
      @dreamsofcode  6 months ago +3

      Thank you!
      Yeah, I will be doing a video on my process this year I hope. I need to find the time to do so! Will drop a community post when I do :)

    • @site.x9448
      @site.x9448 6 months ago

      @@dreamsofcode Awesome, thanks! Would be interesting to know as well!