I would truly appreciate a homelab setup from you, love your videos, God Speed!
Thank you so much!! I'm really looking forward to making the homelab content. Glad you are enjoying the channel as well!
Absolutely, thank you for making the content. You get exactly what you deserve :)
Oh yes please! Love this concept and the surprise factor!
Be aware that with AOF (Append Only File) persistence, Redis only syncs the log to disk every second by default, not after every write.
You are correct! I didn't explain this caveat well in the video.
I thought so too, but the video made me doubt whether I was confusing it for something else. The reason it's called AOF and not WAL is because it's not written ahead of the change itself 😁
You can make it log every write, but it's slow.
If Redis did that, it would lose much of its write speed.
I found Redis to be too slow, so I sped things up by no longer storing any data
Perfect example for: Just because we can doesn't mean we should.
But content-wise, this video is top notch
@@mythbuster6126 I've used Supabase (which is a software stack that includes PostgREST and Auth, etc. using PG) for a few projects and it really speeds up development to not have to create tedious CRUD endpoints for everything. Database logic using PLpgSQL is nice but yeah, we also had the problem that changing something in the procedure code is annoying because you have to recreate the whole method in a migration.
I see this more as for this kind of situation:
"Redis would be perfect for this use-case, except that for this little bit of functionality we need to do a 'relational' operation"
And instead of going "welp, then let's discard Redis", you can go "for that case we can just setup [workaround] and this design would still work"
Same feeling. You have to plan ahead and do so many workarounds to replace what in SQL is basically free.
@@RoyBellingan It's more than that: fully replicated Redis is so expensive. What would be ~$600 USD in SQL is more like ~$1300 in Redis.
@@JohnWillikers Money for infrastructure doesn't really matter for small companies (in comparison with SE salaries). The worst case is when a system has a lot of cache that could be removed with some tricks and a better understanding of the system as a whole.
Hey! Nice one! Online Engineering Lead at Ubisoft here... Sorted Sets in Redis are the go-to solution to do leaderboards in the game industry actually, but you're constantly optimizing to battle server costs since everything stays in memory, and larger leaderboards need machines with more memory, which happen to be more expensive.
But honestly, as soon as you want persistence and you enable AOF for every write operation, you'll start to run into performance issues on Redis too. There's no magic bullet in engineering - weigh the pros and cons for each use-case and then choose the tool.
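For anyone curious, the sorted-set leaderboard pattern described above looks roughly like this with the redis-py client (key and member names are made up):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Record or overwrite a player's score; the sorted set stays ordered by score.
r.zadd("leaderboard:global", {"player:123": 1500})

# Bump a score after a match instead of overwriting it.
r.zincrby("leaderboard:global", 50, "player:123")

# Top 10 players, highest score first, with scores included.
top10 = r.zrevrange("leaderboard:global", 0, 9, withscores=True)

# A single player's rank (0-based, rank 0 = highest score).
rank = r.zrevrank("leaderboard:global", "player:123")
print(top10, rank)
```

Every entry lives in memory, which is exactly the cost/scale tradeoff being described.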
Nice to see a fellow Indian here. Preach brother. ❤
Hey! Awesome to hear from someone actually using Redis for this use case. If I may ask about your experience a bit more: how much memory was Redis using per instance (on a server or in a Kubernetes pod) for a given number of leaderboard entries (i.e. number of users)? I want to gauge whether the in-memory tradeoff is worth it for the speedy performance compared to an RDBMS (I'm assuming Redis is much more lightweight than those databases, as it doesn't have a lot of their abstractions and advanced structures built in). I'm very excited to hear about your experience on this.
Why not use a relational database for the leaderboards?
@@Peter-bg1ku write locks and index rebuilding would be a nightmare in this kind of scenario. The former is somewhat a solved problem (most storage engines have row-level locks), but the latter is detrimental to the underlying behavior of a relational DB.
@@bepamungkas good point. Yeah, indexes would be a massive pita to manage. But what if you stored your relational DB in memory. Would it not be quicker?
You don’t need a cache if your database has enough memory. You don’t need a database if your cache has enough disk space. You don’t need any of them if you have no users.
@@jack171380 you don't need it if the data is not dynamic and no writes are performed; such data can be stored in a file too. I've seen many people who don't really need writes or updates, only reads, yet store the data in a DB instead of a file.
@@jack171380 love it!
In my past experiments with using redis as a primary store it tended to not scale well once the project grew to a certain level of use and complexity. I often use it as a primary store for specific slices of a given application's data persistence strategy when and where it provides advantages over a traditional RDB. Inserts when using a RDB tend to be expensive (especially if the table has a bunch of indices) so Redis can really shine in very write heavy areas of your app like messaging, sensor data capture, etc.
So instead of using postgres I can use redis and recreate most of the SQL functionality myself, with less safety and more complexity and space for error...
I struggle to see why you'd use this in any real-world setting. Even for side projects, if I wanna set something up quickly, I don't wanna have to create indexing etc. myself before being able to develop the app. I'd also be interested in how the performance actually plays out once you start parsing long JSONs; maybe it will still be faster, but it's something you didn't consider. Also, if you eventually decide to switch to postgres, that could be a really hard rewrite.
For years now, once in a while some devs say Redis can be used as more than a cache, but they themselves never use it as the main DB or even as a message broker (because of its limitations); they only use it as a cache. If they do use it as a main DB, that's only for a short time before they end up switching back to relational DBs. So just because Redis can be used as more than a cache doesn't mean it should be. Redis is the ideal choice for caching, similar to how SQL DBs are ideal at supporting multiple tables and joining those tables.
I see this as more of an exercise in what a tool can do, less so a "you should absolutely do this". He even goes over at the end of when you wouldn't want to do this and when you "might" want to. I don't think he's trying to push Redis as a be-all-end-all database. He just really likes the tool and is going over the capabilities to get people excited about it.
As always it depends on the workload etc., there isn't a single correct database for all projects. If you need an SQL database then use an SQL database. The example here of directly comparing it to SQL is a bit extreme and only to show that they can be done, not that they necessarily should. Now on the other hand if your workload just requires blazing fast lookups/inserts without complex relations, eg. maybe an auth, permission or chat service then using redis as the main database starts to make sense.
@@novadea1643 using Redis for auth, permissions, etc. is mostly to support the main DB, not to use it as the main DB
I agree with your points. Redis isn't a full on replacement for postgres (I'm a postgres maximalist personally).
However, I do think for a lot of simpler applications, Redis is pretty viable as the starting data store before you even know what your data model may look like. It also helps to reduce complexity of your overall application in order to get deployed more quickly.
Dedicated HomeLab pls
Absolutely!
Let's gooo 🥳
Lets goo!
Yes, so we can figure out the networking and how to do it securely.
yayyy
Next video idea: most people scoff at the idea of using filesystem as a database, but did you know we can recreate Postgres with fs in c++ and achieve similar functionality and safety
This is a fun idea. Maybe even building a simple database from scratch!
basically a sqlite
I’ve used git as a versioned nosql fs db. Beware of inode exhaustion
use pastebin instead of the local file system and boom, cloud database! give me 500 million in funding plz k thnx
I used a file system as a key/value store DB about 8 years ago and it was very simple and very fast! Granted, looking back it had many holes, such as the potential for race conditions addressed in this video, but it still did the job beautifully!
Can confirm: you were using redis the right way at the start. Enjoyed watching this but wowzer, I can't imagine inheriting a project setup this way...
yeah this feels like the work of the smart quirky programmer who didn't want to use traditional databases because "i can do it myself with redis and it will be faster because redis"
I don't think a serious software engineer would actually use redis like this for a work project
Yep, cool video and nice production quality, but it’s pretty irresponsible to seriously recommend doing this without many massive disclaimers.. I pity the junior devs who are going to watch this and blindly implement it in important projects in place of an RDB.
He is also missing another key point - if you give any traditional RDBMS (Postgres, Maria, whatever) enough memory for buffer cache, disable write-ahead logging, and use index-organized tables, you will get what Redis gives you out of the box, but with some free extra features.
True. At some point technologies become a vocabulary. So when you say "Redis", others read it as "Cache, I know how to use it". This dude just took those words and made his own Esperanto with them. That's like using SQL as a message queue (which I've also encountered and ran away from immediately).
@@harleyspeedthrust4013 without a database schema definition, how does one figure out what is going on in a redis database? With an RDBMS, you can inherit a database without docs and sort of figure out what's happening. Not sure if the same is true for Redis. Very good tutorial as I learnt some neat Redis tricks but I wouldn't recommend this to my worst enemy as the way to build apps with persistence.
You missed a key point. During a snapshot, Redis will fork the process and dump what is in that forked process to disk. This means you need 2x the RAM to perform this action, but it's also a global DB lock during the forking process. If you have a small dataset this is generally fine, but if you have a large DB this process will cause large spikes in latency when the snapshot begins.
Thank you for covering it! Definitely worth understanding when considering Redis.
Another aspect of this is the time it takes to reload large snapshots into RAM. After rebooting my server, I often see errors for the first 30-60s as my application tries to read/write from Redis, but Redis rejects the commands while it's still loading the snapshot into memory in its entirety.
yikes! key indeed
On Unix, forking is fast - it only copies the page tables and sets them all to copy-on-write - you'll only incur 2x memory costs and anything resembling "global db lock" if you're regularly writing to the whole database. This is the reason Redis chose to implement snapshotting via forking in the first place.
Not entirely true. Forking, at least on Linux, only marks the pages as copy-on-write, so they are only copied when one of the two processes writes to them. It might be theoretically possible to end up with the whole dataset copied, but very likely not. You are right that this will create some lag spikes when the page faults are triggered, but it does not need to copy all the memory.
This made me appreciate SQL more. Look what they need to mimic a fraction of SQL power
Weird flex... and incorrect.
Is this a joke comment?
Spoken like someone who has no idea what things even mean, bravo.
What Redis does is give sub-millisecond lookups and provide a significant number of data types (hash sets, linked lists) that can be used for distributed operations, which are necessary in cloud-native applications. It also provides very fast pub/sub and stream patterns that can be used for message buses, none of which SQL provides, plus distributed locking that can be used to synchronize microservices, along with probabilistic structures and much more.
Plainly said, the Redis use case is entirely different from a SQL database's. Their overlap is minimal at best. If you use a SQL database for all of the things I just listed, you are braindead. If you use Redis for data operations such as aggregations and grouping, you are braindead as well.
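To make the pub/sub point concrete, here's a minimal redis-py sketch (channel name and payload are invented):

```python
import redis

r = redis.Redis(decode_responses=True)

# Subscriber side: register interest in a channel.
pubsub = r.pubsub()
pubsub.subscribe("orders")

# Publisher side: fire-and-forget delivery to whoever is subscribed right now.
r.publish("orders", "order:42:created")

# Consume messages (in a real service this loop runs in its own worker/thread).
for message in pubsub.listen():
    if message["type"] == "message":
        print(message["channel"], message["data"])
        break
```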
Likewise. I'm not sure why they all fear using SQL so much. Truly a strange phobia.
Broadly agree but if you have a very specific application with clear and (more or less) fixed data definitions you do pay at run time for SQL's flexibility. Almost all SQL systems can be replaced by faster solutions tailored for the specific context. SQL's real power is that it's "good enough" for a vast range of requirements.
Amazing work! Not only the editing and presentation which is always so delightful, the idea itself of using Redis as a database and the persistency concerns are perfectly explained.
THANK YOU!
Thank you very much!
I just started working on backend and we have a fairly complex setup. This video dropped just in time. Not only does it tell you what Redis can and can't do, it's also a small primer for getting started with Redis. Top content.
So now I know that I *could* sort of replicate a relational-ish storage model in Redis by reimplementing abstractions that are already in place in any RDBMS by hand at a lower abstraction level. Which is indeed quite interesting. The question is: Why should I, instead of just using an RDBMS?
0:01 not anymore 🥲
You've been doing great. The content is presented at a reasonable speed, well, at least for me. And the content itself has been very interesting and informative. Personally, I'm really interested to hear about your home lab, so please do an overview of it and maybe down the line a deep dive ;-) Thanks again, bud!
Thank you! I'm glad you enjoyed it.
@@dreamsofcode I am also interested in your homelab. Everything very well explained in this video, a pleasure to watch and learn
Our company uses Redis as a runtime database for our live chat system, and it saves off transcripts into a real RDB at various checkpoints. The footprint in Redis for all of that is 400MB, and we saved an immense amount of maintenance/development cost doing it this way.
Totally agree with the video; the biggest drawback is the lack of higher-level primitives, but in some cases it's indeed worth implementing them yourself on top of Redis.
The company I work for had a chat system as part of our Django + PostgreSQL API that could handle around 1k messages per second before it started choking up (can't remember the exact numbers, it's been years). Partly because of the Django websocket implementation and partly because of the database model designs; they could have been optimized further, but the performance would still have been far from ideal.
We reimplemented it as a Node.js microservice with Redis as the main database, and based on tests a single instance could easily handle 10k+ messages per second, with no specific optimizations, just from switching language and database. After deploying to production it has never gotten enough real-world load to use more than a fraction of a single CPU core in the cluster.
You are really great at this. Your explanations are concise and clear. The pacing is perfect. The editing is quite good.
It might be good to add that redis transactions are not nearly as powerful as SQL transactions.
You can resort to using Lua scripts to fix that, but they are really annoying to maintain and to write proper tests for.
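Agreed. For anyone who hasn't seen them, Redis "transactions" are basically optimistic check-and-set with WATCH/MULTI rather than full ACID transactions; a rough redis-py sketch (the key name is hypothetical):

```python
import redis

r = redis.Redis(decode_responses=True)

def decrement_stock(item_key: str) -> bool:
    """Decrement a counter only if it's still above zero, retrying on contention."""
    with r.pipeline() as pipe:
        while True:
            try:
                pipe.watch(item_key)          # fail the EXEC if this key changes
                stock = int(pipe.get(item_key) or 0)
                if stock <= 0:
                    pipe.unwatch()
                    return False
                pipe.multi()                  # start queueing commands
                pipe.set(item_key, stock - 1)
                pipe.execute()                # raises WatchError if we lost the race
                return True
            except redis.WatchError:
                continue                      # somebody else wrote the key; retry
```

There's no rollback of partially applied logic like in SQL, which is why people end up reaching for Lua scripts.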
I think you've convinced me to stick with RDBMS + SQL as the primary database for my relational data 😁
Perfect explanation. You covered all of the useful use cases and made them easy to understand with real-world examples. Thanks!
Really enjoyed this. It was cool to see the reasoning behind certain redis features. Thanks!
I see some use cases in which you could use Redis as you said, but for me the main drawback is data consistency. Having a very well-defined schema for your data, and the responsibility of shaping it right, makes things way more maintainable. That's my personal problem with NoSQL in general.
For me, I fell in love with Redis when I had to do work on a logistics management system. Out of the blue, just reading the documentation, I found out that it has in-built geo location functionality and can easily determine the straight line distance between two points, accounting for the curvature of the earth (correct me if I'm wrong, but I believe under the hood, it uses the Great Circle function). Thanks for the vid.
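For reference, the geo commands being described look roughly like this (redis-py; coordinates and key names are invented, raw commands are used because the geoadd helper's signature changed between redis-py versions, and GEOSEARCH needs Redis 6.2+):

```python
import redis

r = redis.Redis(decode_responses=True)

# Store positions (longitude first, then latitude).
r.execute_command("GEOADD", "drivers", -122.4194, 37.7749, "driver:1")
r.execute_command("GEOADD", "drivers", -122.2711, 37.8044, "driver:2")

# Great-circle distance between two members, in kilometres.
km = r.geodist("drivers", "driver:1", "driver:2", unit="km")

# Everything within 20 km of driver:1, nearest first.
nearby = r.execute_command(
    "GEOSEARCH", "drivers", "FROMMEMBER", "driver:1", "BYRADIUS", 20, "km", "ASC"
)
print(km, nearby)
```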
I was already using redis as a primary database, so seeing this video cheered me up!
Wow. Just wow. You totally delivered, forwarded this to my colleagues.
Thank you so much!
I've already been using it as a primary database. Our data is very small and most of it is non-persistent (status hash tables that expire after 30 days). But we do keep a record of transactions, which is an integer timestamp and a simple 41-character id string. I also commit an apparent faux pas: I use the logical DB feature to organize my data. Apparently it's an anti-pattern because some people may implicitly think the logical DBs are actually separate systems when they're not. But that doesn't really matter, since the way I use them is almost like folders to help organize my data.
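For anyone unfamiliar, the logical-DB trick is just the db index on the connection (or SELECT in the CLI); a small sketch with arbitrary indexes and key names:

```python
import redis

# Each logical DB is just a separate keyspace inside the same server process;
# memory, persistence settings and performance are all still shared.
status_db = redis.Redis(db=0, decode_responses=True)
transactions_db = redis.Redis(db=1, decode_responses=True)

# Status hash that expires after 30 days.
status_db.hset("status:device:42", mapping={"state": "online", "battery": "87"})
status_db.expire("status:device:42", 60 * 60 * 24 * 30)

# A transaction record (timestamp key, 41-character id) kept in its own "folder".
transactions_db.set("tx:1700000000", "x" * 41)
```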
My main concern of using Redis as a main database is memory, not persistence. Most people know that Redis can persist data on disk.
Memory is much more expensive and complicated to expand than simply adding storage.
And there's the fact that you need to think about scaling sooner rather than later. Scaling now depends more on the amount of data you need to always keep in memory (even if you're not using it), not on the number of users (who typically generate profit). If only there were a way to offload unused data and keep only the frequently accessed entries in memory... Oh right, caching! That's why I prefer to use it as a cache. Redis used to have a Virtual Memory feature (to swap unused data to disk), which is now deprecated for some reason. One can use the OS swap instead, but it's never explained whether that's a good idea.
P.S. Scaling Redis seems to be easier than scaling SQL databases, but this aspect is never mentioned when talking about using Redis as a main database.
Did you watch until the end?
@@dreamsofcodenot yet
straightforward, well written, clear and simple videos, love them!
For personal projects I use IndexedDB and then I sync it to sqlite on the back end. With everything on the front end it makes it super simple to have things nice and speedy. Granted, I don't need to worry about live data this way. So, it isn't a good model for all apps.
I love this concept, but that API on IndexedDB though... ooof, it's not great
I can't believe how quick you made a 21 minute video about Redis feel. Great video
Thank you!
I use it to lock out a request across multiple instances of an app. Where money and transactions are completed, I couldn’t have a user start more than one thread across multiple apps, so it’s GREAT to lock out that process single file by user
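That pattern is usually just SET with NX and an expiry. A bare-bones sketch (key names and TTL invented; real deployments often reach for a Redlock-style library instead):

```python
import uuid
from typing import Optional

import redis

r = redis.Redis(decode_responses=True)

def try_lock(user_id: str, ttl_seconds: int = 30) -> Optional[str]:
    """Acquire a per-user lock shared by every app instance; returns a token or None."""
    token = str(uuid.uuid4())
    # NX: only set if the key doesn't already exist. EX: auto-expire so a
    # crashed instance can never hold the lock forever.
    if r.set(f"lock:payment:{user_id}", token, nx=True, ex=ttl_seconds):
        return token
    return None

def release_lock(user_id: str, token: str) -> None:
    key = f"lock:payment:{user_id}"
    # Only release a lock we still own. (This get-then-delete isn't atomic;
    # a small Lua script or a Redlock-style library makes it so.)
    if r.get(key) == token:
        r.delete(key)
```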
Good to know. Sometimes you just happen to have a Redis instance and need a minimum of database functionality. In my former job I implemented a caching service where the invalidation logic was very complex; I ended up implementing it in Postgres, but we already had a Redis instance for caching auth sessions, so I could have done it there as well.
Was using redis as my primary database, but still learned a lot from watching this
Killing the pod will just send SIGTERM, and the DB can actually catch that and save the data. I think it might be much more interesting to have a program running in the background spamming writes and then SIGKILL the DB, to see how frequently it saves to disk, whether any data corruption arises, and how it handles it. It surely doesn't write every single action to disk without any batching; that would be too slow.
You can configure the AOF; on my instance it was flushing to disk every second. It does store every write operation and it does impact performance. If you want a fast cache and don't care about data loss, then it's not worth enabling.
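For reference, those knobs can also be set at runtime; a minimal sketch (the same settings can live in redis.conf instead):

```python
import redis

r = redis.Redis(decode_responses=True)

# Turn on AOF persistence at runtime (equivalent to `appendonly yes` in redis.conf).
r.config_set("appendonly", "yes")

# Fsync policy: "everysec" (default, up to ~1s of writes lost on a crash),
# "always" (fsync on every write, much slower), or "no" (leave it to the OS).
r.config_set("appendfsync", "everysec")

print(r.config_get("appendfsync"))
```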
I will stick with a Relational Database
I’m reminded of the scene in Jurassic Park when Ian Malcolm says ‘Just because you can, doesn’t mean you should’. Or something like that.
The awesome bomb you threw was *"don't use KEYS in production unless you're debugging, it's slow and blocking"*. I seriously didn't know that. I searched it up and it is absolutely true.
It's a rite of passage when debugging in prod at times!
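For anyone wondering what to use instead: SCAN does the same job in small cursor-driven batches, so it doesn't block the server the way KEYS does. A quick redis-py sketch (the key pattern is made up):

```python
import redis

r = redis.Redis(decode_responses=True)

# Don't: KEYS walks the whole keyspace in one blocking call on the single
# Redis thread, stalling every other client while it runs.
# all_users = r.keys("user:*")

# Do: SCAN iterates in small batches between serving other commands.
for key in r.scan_iter(match="user:*", count=500):
    print(key)
```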
High quality content thanks Dreams of Code.
Glad you enjoyed it!
Using Redis for a DB is similar to using MongoDB or any key/value NoSQL database. You outlined the cons quite well, but addressed them with Redis data types and transactions. I think the big pro of using Redis as a NoSQL database is the simplicity, if one has a use case that doesn't need complex querying. It avoids what's referred to as impedance mismatch, I believe, since the data shape of your UI + API matches that of the database.
Definitely interested in your setup :P - thanks for the hard work on this content!
My pleasure! Glad you enjoyed it!
this is a great video, i hope redis doesn't change its license to a non-open source license
they already did.
Very good and informative video, thank you.
As a user of PostgreSQL since 2001, I think you are missing the biggest and most important feature of an RDBMS -> data and referential integrity. This is unachievable with Redis or any other NoSQL solution.
Of course, if you have complex data models and you have the need and skills to write good SQL, Redis falls short big time (CTEs, window functions, triggers, aggregates, etc.).
But if you are building something really simple or without structure, redis is great.
Thank you for the beautiful content, infos, effort... thank you
This 20 minute video taught me so much more than all the other redis videos on the internet, thanks ❤
Your videos are amazing. So clear and well-paced. This is the first one of yours that I've come across. Definitely subscribing. Do you use After Effects for all your anims?
Thank you!
I'm transitioning more and more to Davinci Resolve (Fusion) entirely, but there's some things I still use After Effects for! Fusion is really powerful, but there's a lot less resources on how to use it.
@@dreamsofcode yeah, I used to use AE a little. I use Resolve now but have been a bit slow to dive into Fusion. It's actually pretty good from what I can tell. Thanks for letting me know. I'm always interested to see how people make their videos cool.
I use Tarantool. It supports SQL, also has an append-only file, and can store more than RAM if you use the vinyl engine (not the default memtx engine). The gotchas are that there's no date data type (so I use epoch/timestamp without timezone), and table/column names are all uppercased, so if you create a table with lowercase/mixed-case names you must quote them in queries.
So it can overcome all the Redis limitations.
For non-OLTP use cases I use ClickHouse: logs (compressed, searchable), metrics (using materialized views), and all events are stored there. If I need to read something fast, I periodically push it back to Tarantool, so all use cases can be fulfilled with a combination of both.
Oh, there's also one more gotcha in Tarantool: you cannot alter a table to insert a column in the middle (only at the end is possible), you have to copy the whole table. But since it's really fast, 200k RPS like Redis/Aerospike, it's really easy to change the schema with a full copy.
I once worked with an application that persisted all its data in Redis, and it had a catastrophic data loss event: all data for the company's customers went up in smoke when the Redis server restarted, because it had been misconfigured so that only the server password was set. So I'd really not recommend using Redis as your primary data storage.
It's also nice to know about ACID to understand why redis is not a good idea for a primary database. Overall nice video, thanks
I should do a video on ACID!
Believe it or not, many non-ACID databases are used as primary databases, such as MySQL (MyISAM), MongoDB, and CouchDB!
I think if you need very high throughput for read/write operations, sure. But making Redis a drop-in replacement for something like Postgres or MongoDB... it's not the same tool. It wasn't really designed to do the things you are suggesting, so I feel you are trying to force a square peg into a round hole.
Such high-quality content. We need more content like this on YT.
If I had to choose a favorite DB of any kind, it would probably be Redis. I also recommend that anyone build a simple CRUD app using only Redis as the database and implement all the operations with that. It'll teach you so much about structuring data and how to query it.
Redis persistence doesn't guarantee data is saved during interruption by design. MySQL does (and that's why it's slower).
@Dreams of Code, how would you classify a database as small or big? What if my writes to the database are for 20,000 users with around 3,000 concurrent requests to the DB? Will Redis be OK in this scenario, or might Postgres work out better? I will definitely do regression testing on both, as the project itself is small, but I'd still like to know everyone's perspective as well.
Awesome work! Thanks! Please also do a separate tutorial for "Python and Redis Queues".
I worked on a project where Redis transactions out of the box were insufficient. Fortunately Redis also supports loading Lua scripts onto the Redis server. I think this is the only way to do certain business logic transactionally? But it's cool, because in the analogy of this video, you can kinda point and say 'look, stored procedures too!'
The Lua integration is super powerful. I believe you're correct and all Lua scripting is done in a transaction, which does allow for more business logic. I think the major caveat is that it locks the database, so as long as it's not doing anything slow then it's all good.
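A toy example of the "stored procedure" idea: a hypothetical balance transfer run atomically as a Lua script via redis-py. Key names and amounts are invented, and as noted above, a slow script would block everything else.

```python
import redis

r = redis.Redis(decode_responses=True)

# Move `amount` from one balance to another, but only if funds are sufficient.
# The whole script runs atomically on the server.
TRANSFER_LUA = """
local from_balance = tonumber(redis.call('GET', KEYS[1]) or '0')
local amount = tonumber(ARGV[1])
if from_balance < amount then
  return 0
end
redis.call('DECRBY', KEYS[1], amount)
redis.call('INCRBY', KEYS[2], amount)
return 1
"""

transfer = r.register_script(TRANSFER_LUA)

r.set("balance:alice", 100)
r.set("balance:bob", 10)

ok = transfer(keys=["balance:alice", "balance:bob"], args=[25])
print("transferred" if ok else "insufficient funds")
```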
Nice, I like Redis, but there is a lot to manage, and that is a lot of data sets. A normal DB lets me abstract this away with table index magic, and in this case I might as well just use Python/TCL/C++ with hashes, sets, and arrays (though TCL will let me push raw commands in, avoiding the need for an API) and write every transaction to a rotating log file with a threaded snapshot by time.
I am very interested in your updated Home Lab Setup, please do a video or post about it.😀
I always wondered what Redis was used for, this is a great video to go from zero to having some clue.
Glad you enjoyed it!
Thank you for this video, you taught me some new things about Redis!
Did you leave out RedisJSON on purpose? It's an awesome feature; it allows for storing JSON documents as values and creating indexes on the actual JSON content. We're using it in our CQRS setup, both as the message store and as the query model.
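For anyone who hasn't tried it, a tiny sketch of the document side of RedisJSON (requires the module, e.g. a redis-stack image; raw commands are used here to avoid depending on a particular redis-py helper API, and indexing the JSON content additionally needs the search module):

```python
import json

import redis

# Requires the RedisJSON module, not plain Redis.
r = redis.Redis(decode_responses=True)

doc = {"id": 42, "name": "Ada", "scores": [10, 20, 30]}

# "$" is the JSONPath root; the whole document is stored as native JSON.
r.execute_command("JSON.SET", "user:42", "$", json.dumps(doc))

# Read back a single field rather than the whole document.
name = r.execute_command("JSON.GET", "user:42", "$.name")
print(name)  # JSONPath results come back as a JSON array, e.g. ["Ada"]
```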
+1 For the homelab setup video!
Postgres transactional speed > Redis transactional speed. Also, Redis is single-threaded for reads/writes, so you need a complex cluster, and one command can lock everything.
You used Redis correctly before; now you're using Redis for what it isn't designed for. It seems like you're making bold claims without real usage of these approaches in big systems.
Yep, you definitely have to consider the use cases for each approach. That being said, we've got a decent cluster that does some real heavy lifting, but we are using a 5-node cluster with sharding to achieve it.
ayyeee congrats getting a home lab set up that's sick!!
Really good introduction to Redis! Huge thank you! I see you're a man of culture as well, with those Dragon Ball Z and Naruto references ❤
you should also cover the redis extensions. they're great
This is a great idea! I shall add it to my backlog
If you are fine with consistency problems and also don't care about constraints and atomic transactions (I cannot even think of such a use case), then yeah, you might want a key/value store that lives in memory. For playing around a little bit, you usually don't need a database at all.
Alright, I'll try it for the project I'm working on. If this fails we'll be back with our pitchforks
Homelab video would be really interesting to watch!
AOF persistence may be fine if you're doing a read-heavy workload, but it limits your write performance to no better than a database that's writing to disk.
On the other side of the equation, if you're willing to dedicate enough memory to your primary DB to hold your entire dataset in memory, you can accomplish that with fairly simple cache settings.
There are still some things that Redis will do better, but it's also going to come with a lot of trade-offs, a big one being that you can't really use an ORM, and that a lot of really large-scale databasing strategies just don't work well with Redis (at least not without a ton of work).
I only use it for sessions. Works well, especially for storing OAuth flow access tokens.
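That use case is basically just SET with a TTL; a small sketch with made-up key names and timings:

```python
import redis

r = redis.Redis(decode_responses=True)

session_id = "sess:3f9c"

# Store the access token and let Redis expire it on its own; no cleanup job.
r.set(f"session:{session_id}:access_token", "placeholder-oauth-token", ex=3600)

# Sliding expiration: refresh the TTL on each authenticated request.
r.expire(f"session:{session_id}:access_token", 3600)

token = r.get(f"session:{session_id}:access_token")
print(token)
```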
At this point, I would like to see a video on how you make your YouTube videos. Very clean and interesting. Please make a video on how you achieve this, including all the fun and interesting animations.
Thanks
1:47 Exactly! Please create a video on setting up a homelab. By the way, I have only a single laptop. :)
Dedicated HomeLab channel?!?
Yes, please
i second this so hard
What I would have really enjoyed would be comparisons to something like Postgres. When does Redis perform better, when Postgres?
It's a completely different database from Redis, so how can we compare them? Redis is for fast queries, Postgres is for saving data; two completely different things.
1:50 that would be great!
We have switched from Redis to KeyDB, a pretty cool alternative.
I'm gonna add a video to the backlog to check it out!
Great video. I'm convinced Redis is only good for the caching layer.
This video taught me a lot about Redis and thinking in Redis. But I will still choose PostgreSQL as the primary one.
I'm working as a developer and I knew this stuff before I got there, and kept wondering why people aren't considering Redis as the DB instead of just a caching layer. I actually didn't know it could be used as a cache as well before then 😂
I'm just learning programming, so from my perspective I'd say cache always, as it seems to be really good at it, nothing can replace it, and it's fast to implement; as a database manager it makes no sense, as a mature DBMS can do the work way faster. So this would be my default config until I need to change it, because each is used where it does a good job, so the default should be the more efficient one; therefore a new tool, or expanding the capacity of either, should only be added when it's more efficient that way, in both implementing and using.
I don't think so. We're using it as a fast key value store in HA and in all our tests, failing any single redis node would result in lost data. It's not for that purpose.
Has no one made something like a SQL-to-Redis compatibility layer? So I could have a "create table" command and it does the appropriate steps for that in Redis for me?
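Not that I know of, but the translation such a layer would have to do is roughly this: each "row" becomes a hash and each "index" becomes a set you maintain yourself. A toy sketch (not a real library; names invented):

```python
import redis

r = redis.Redis(decode_responses=True)

def insert_user(user_id: str, name: str, country: str) -> None:
    """Roughly what 'INSERT INTO users ...' would have to expand into."""
    with r.pipeline() as pipe:            # wrapped in MULTI/EXEC on execute()
        pipe.hset(f"user:{user_id}", mapping={"name": name, "country": country})
        pipe.sadd("users:all", user_id)                     # "table" membership
        pipe.sadd(f"users:by_country:{country}", user_id)   # hand-rolled index
        pipe.execute()

def users_in_country(country: str):
    """Roughly 'SELECT * FROM users WHERE country = ?'."""
    ids = r.smembers(f"users:by_country:{country}")
    return [r.hgetall(f"user:{uid}") for uid in ids]

insert_user("1", "Ada", "UK")
print(users_in_country("UK"))
```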
Great insights, thank you!
Please talk about use cases for Redis and the memory constraints or expectations for each workload
Great idea!
Great vid man! What's the zsh plugin you use to show the arguments for every command? Or is it just redis? Thanks!
Thank you! That is just the redis-cli. It's a really cool feature
Absolutely great video, I can't thank you enough 🙏 sir, you are a master Jedi in Redis... P.S. your Tom Bombadil part caught my eye :)
Thank you! I'm glad you noticed that!
silmarillion moment
Don't the Redis OM libraries take care of the first drawback you mentioned?
Your terminal looks interesting; can you share what theme you used? 2:23
redis should sponsor you now
unrelated note but
yesterday i had my first dream of code
I would like to see performance and ease of use compared to an in-memory SQLite database
To see if the ACID compliance and the SQL features come at a cost, for the dev or the runtime speed
Just leery of the time cost of migrating from Redis to something else if requirements expand. But, assuming requirements are unlikely to change, then I've drunk the Kool-Aid!
I would argue against using Redis as a primary database, although I'm sure there are use cases, as you pointed out. But in general, a primary database needs to be more flexible than what Redis can provide without creating too much complexity. This includes hundreds of tables, various required indexes, primary data that needs to be easily accessible via traditional SQL queries for different users or departments, ease of generating reports, hundreds of millions of records, and replication. If you start with Redis and later migrate to a relational database, that is not trivial. One of the biggest things (that I think a primary database has to do) is generating reports with several joins, multiple criteria, and really messy SQL, and for that it's just not practical to use Redis. I think it could be fine using Redis as a database, but not as the primary database.
Would like to see a video on your homelab. Is there one already, or will there be? Thanks!
I really don't think of Redis as a primary database; as a cache store it's fine. I'm worried about how solid Redis's persistence really is. Also, I know the size of my T-SQL tables and the row and column widths, so I can predict overflow numbers. How would I ever know Redis has suddenly reached a memory-full state? PostgreSQL has database backups, so that's a concrete relief!
There's def some good concerns here. You can add some decent observability and scaling to a Redis cluster to help prevent running out of memory (that's what we currently have). It does require some more thought than Postgres.
That being said, horizontally scaling postgres is a little more complex than it is with redis IMHO. Although both of them do support it pretty well
@@dreamsofcode Also, I think for small/medium-scale data apps your idea is fine, because it's storing data as key-value pairs, so it's kind of NoSQL. But as the data grows large, a search operation has to traverse the entire list, whereas a relational database keeps things indexed, so it searches quickly because it knows which specific area to search in.
Definitely do a home lab setup tutorial!
Homelab video please. I have a homelab setup and would love to see yours!
I used Redis as a database for location data, which for moving objects is ephemeral by default. The geospatial store was super useful for finding objects that are close by.