Awesome content. I think it's better that the Tweet Ingestion Service directly pushes the tweet info into Kafka only, and then the Tweet Service reads from Kafka and stores it in Cassandra. Ingesting into both Cassandra and Kafka at the same time can make the Tweet Ingestion Service a bit slower, and we also cannot ingest into Kafka and Cassandra atomically from the Tweet Ingestion Service. Thoughts?
It need not be atomic; those are two different flows and can be done in parallel. Sometimes pushing to Kafka might fail, sometimes inserting into Cassandra might fail, and these failures need to be handled independently. I personally disagree with only pushing to Kafka, because the tweet action would no longer be synchronous and you won't have an ACK for the client who is waiting for their tweet to be successfully posted.
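The dual-write being debated here can be sketched roughly like this (all class and function names are illustrative, not a real API): the store write stays synchronous so the client gets its ACK, while a failed stream publish lands in a retry queue to be replayed independently.

```python
class TweetStore:
    """Stand-in for the Cassandra write path."""
    def __init__(self):
        self.rows = {}

    def insert(self, tweet_id, body):
        self.rows[tweet_id] = body


class Stream:
    """Stand-in for the Kafka producer; may fail transiently."""
    def __init__(self, fail_first=False):
        self.events, self._fail = [], fail_first

    def publish(self, event):
        if self._fail:
            self._fail = False
            raise ConnectionError("broker unavailable")
        self.events.append(event)


def post_tweet(store, stream, retry_queue, tweet_id, body):
    store.insert(tweet_id, body)          # synchronous, so the client gets an ACK
    try:
        stream.publish((tweet_id, body))  # async consumers (timeline, search) hang off this
    except ConnectionError:
        retry_queue.append((tweet_id, body))  # replay later; the ACK already happened
    return "ack"
```

The point is that the two writes fail independently: a broker outage delays fan-out but never blocks the user's ACK.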
Why do we need a separate service to GET the tweets? is it because there will be lots of GET requests for timeline generation? Thanks again Sandeep for your effort.
Hi! Your designs are very simple and clear. Congratulations! Just wondering... why don't you use KSQL's tables to keep the users and graph updated instead of Redis, so when another microservice needs the information it only has to look up the KSQL view? For example, the tweet processor could access this information directly without having to ask the Graph Service.
System design is very subjective. Actual implementation is very different from the way explained and much more complex technologies being used. Every day, tons of new tweets and tons of new users added globally. This is very high level design specification. You can have your own design which suits the purpose.
Hi - Your System Design Videos are awesome!! Good Job! You have covered all the core points(scaling issues, mitigation, etc). By the way, Could you Please post one or two videos on How to distinguish between System Design vs Object oriented design questions? for e.g, Can we apply the same technique for questions like Design an Elevator or Design a parking lot system, etc. Appreciate your help! Thanks again!
Thanks for the video. Can you please make a video on designing a distributed counter that tells how many users have the website open at any point in time?
Great video.. One specific question though: how many records do we keep in the cache for each user, and what happens when a user has seen all the cached tweets, a DB query?
If capacity permits, we can store them forever. However, this will involve a lot of cost, and that might not be acceptable. In that case we can keep some time window, let's say 4 weeks of content in the cache for each user. There is no right or wrong here; it's just a cost vs latency tradeoff. Past the 4 weeks, we can query the DB to get other tweets that should be shown to the user, using the same flow as that of passive users. Once a user has seen all the tweets, we will not have any more tweets to show, and we can maybe then suggest some users this person could follow, to make the UX look decent enough.
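The cost-vs-latency tradeoff described above can be sketched as a read path that serves from the cache within the time window and falls back to the DB beyond it (the names and the 4-week window are illustrative):

```python
WINDOW = 4 * 7 * 24 * 3600  # four weeks, in seconds

def timeline(user_id, now, cache, db, page_size=2):
    """Serve recent tweets from cache; fall back to the DB for older ones."""
    recent = [t for t in cache.get(user_id, []) if now - t["ts"] <= WINDOW]
    if len(recent) >= page_size:
        return recent[:page_size]
    # Cache exhausted: query the DB the same way the passive-user flow would.
    older = sorted(db.get(user_id, []), key=lambda t: -t["ts"])
    return (recent + [t for t in older if t not in recent])[:page_size]
```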
Amazing video! Can you please explain how you would handle a single point of failure? Example: what if the tweet ingestion service is not available due to a failure, how would we handle that?
Thanks!! There is usually one simple way to do that, which is to distribute it across multiple nodes, across multiple DCs, and do the same for the databases. Also, having a smaller amount of work done by each service is good, as it reduces the impact of something going down, which takes the conversation to a microservices/SOA architecture. But no matter what we do, there will always be something that breaks; the goal is to design the system in a way that minimizes the impact and allows us to replay the transactions that came in during the outage period.
Thanks for creating such awesome content. I have a query. How do we handle a case when the Tweet Ingestion Service is able to persist Tweet info in Cassandra but unable to persist in Kafka ( due to system crash or some network failure) or vice versa ?
Why not use Kafka as the source of truth and add a Kafka connector that reads from Kafka and publishes to Cassandra? Otherwise, we would need some reconciliation service between Cassandra and Kafka to make sure that both data sources are aligned, or we need to make sure that both Cassandra and Kafka are available and the write was ACKed by both in the ingestion service. I see that there is a lot of Redis here; wouldn't it make more sense to always have Kafka as a broker and add connectors that can write to Redis? Maybe not in all use cases, but most of them are not latency sensitive.
Thanks bro for such a detailed design.. appreciate all the hard work. A naive question: doesn't storing tweets in an Elasticsearch cluster result in duplication of data (they are already present in Cassandra)? In other words, can't the Search Service just work out of Cassandra? Am I missing anything (pros/cons)?
If you read some articles regarding Elasticsearch, it is usually set up to be synced with Mongo/Cassandra to handle search. I think it is for scalability: using Elasticsearch instead of searching in Cassandra.
How come no one talks about delete-tweet flow? What happens to pre-computed timeline feeds? Would you delay timeline creation? or would you track which tweet is part of which pre-computed timeline?
Hi sir, loved your explanation. Keep up the great work, will check out your other videos too.. Could you suggest a good book that gives me an intuition into designing such data systems? Thank you :)
@CodeKarle Why are all your designs against a microservices architecture, with the User Service being a single source of truth for every user that can be queried by other services, whereas in microservices each service does not call another service directly? This is something I have been asked in my interviews; can you please let me know how to respond to such questions, as I mostly follow your approach in interviews.
Thanks for the great video. Why is the User DB hosted in a MySQL database instead of a NoSQL store like MongoDB? Is it because you are already using a Redis cache?
Great video. I too have the same query regarding MySQL. Looking at the scale, and the fact that the data would not be updated much, shouldn't NoSQL be used here? Also, the maintenance of MySQL would be very expensive. Even if a Redis cache is being used, the database layer is still SQL and more expensive to shard. Any thoughts, anyone?
@codeKarle This tutorial series is amazing, thank you for creating this. Quick question: why not use Cassandra for the graph DB? There are limited types of queries needed for this, like "get all followers by userId" or "get all followees by userId", so we can store data in Cassandra like: followerId -> [list of followee ids], followeeId -> [list of follower ids]. Then accessing all the followers of all the followees of a user should be faster than the traditional MySQL approach, and will therefore improve performance for things like feed generation. Do we need ACID properties for the graph DB? Updates in these tables mainly append a new follower or followee to the lists, which should be fine without an RDBMS as well?
Because Cassandra doesn't handle updates well... in production the system will accumulate huge tombstones. Cassandra is good for profile details, not for deletes and updates. I follow and unfollow very frequently, which will cause huge tombstones.
You are soon to become everyone's go-to system design mentor...keep up the good work
It's great to hear this!! Thanks for the nice words!
Do share the channel with your colleagues, it helps :)
@@codeKarle This is easily the best system design channel on YouTube. I'd pay to get such quality content. Haven't seen new videos from you in a while. I understand the effort that goes into making a video, but please keep them coming.
He has disappeared from the scene. codeKarle is easily the best ever, with crystal clear explanations and very strong fundamentals. Undoubtedly the best!
@@zombiecrab0 I saw his course on Udemy
Thanks, Sandeep, for the great content. I cleared a FAANG interview by watching your videos for the role of Sr. TPM. Cheers!!
Sir, where are you nowadays .. you are a gem sir. Please keep posting for people like me to excel in life! Thanks for your content and my contentment :)
The part about handling users separately was genius; I was very confused while reading other documentation. Thank you for putting this together.
Thanks for the great content! Love the way you explain concepts in great detail and have a consistent approach across all designs. Some ideas for future videos that we are looking forward to -
1. Designing google drive
2. Designing logging and alerting system
3. Designing deployment service
4. Designing shared docs like google docs.
great explanation, even better than paid system design resources.
Great material, man. I worked for 11 years at Groupon, where we basically used every single one of the technologies you are describing. All the information you share is GOLD for new and experienced engineers. Keep it up. I have seen 2 of your videos so far and I'm already convinced that this is one of the best channels out there.
This is awesome. Something worth pointing out is that Redis becomes a SPOF in this design -- we are relying too heavily on our Redis instance for the timeline cache, and when it goes down we will end up thrashing the DB, since no persistent mapping is available in Cassandra to handle the timeline feature. One suggestion would be to have a separate timeline table in Cassandra, partitioned by the user id and sorted by tweet timestamp (desc), with tweet_id as the data field. So something like (user_id, tweet_ts, tweet_id). This table would be populated in an async manner whenever a new tweet happens (quite possibly by a new service which listens to the Kafka tweet stream).
In this case, timeline request becomes a "top K" read of this table for a given user_id followed by a bulk tweet GET API for the list of tweet ids retrieved.
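A rough sketch of this suggestion, with an in-memory stand-in for the Cassandra table (all names are illustrative): a consumer on the tweet stream fans each tweet out into per-follower (ts, tweet_id) rows, and the timeline read becomes a top-K sort followed by a bulk content lookup.

```python
from collections import defaultdict

# Stand-in for the (user_id, tweet_ts, tweet_id) timeline table.
timeline_table = defaultdict(list)   # user_id -> [(tweet_ts, tweet_id), ...]

def on_tweet(event, followers_of):
    """Kafka-consumer callback: write one timeline row per follower."""
    for follower in followers_of[event["author"]]:
        timeline_table[follower].append((event["ts"], event["tweet_id"]))

def top_k(user_id, k, tweets_by_id):
    """Timeline read: top-K newest rows, then a bulk GET for the contents."""
    rows = sorted(timeline_table[user_id], reverse=True)[:k]
    return [tweets_by_id[tid] for _, tid in rows]
```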
Cassandra is really bad at aggregation
A better solution is to have a distributed redis cluster with sharding based on tweet_id using consistent hashing, or using sharding based on tweet_id + timestamp, so there is no bottleneck
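For the curious, a minimal consistent-hashing ring along the lines suggested here (a sketch, not production code): virtual nodes smooth out the key distribution, and adding or removing a shard only remaps the keys on neighboring arc segments.

```python
import bisect
import hashlib

class Ring:
    def __init__(self, shards, vnodes=64):
        # Each shard gets `vnodes` points on the ring for an even spread.
        self._points = []   # sorted (hash, shard) pairs
        for shard in shards:
            for v in range(vnodes):
                self._points.append((self._hash(f"{shard}:{v}"), shard))
        self._points.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def shard_for(self, key):
        """The first point clockwise from the key's hash owns the key."""
        h = self._hash(str(key))
        i = bisect.bisect(self._points, (h, ""))
        return self._points[i % len(self._points)][1]
```

A sharded-cache client would call `shard_for(tweet_id)` (or `tweet_id + timestamp`) to pick the Redis node, so no single instance is a bottleneck.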
Every time before a tech interview I look for your videos, and since you're not uploading a lot recently, YouTube makes it really hard to dig those out. This tells me something about the work-life balance you have right now, lol.
His current company policy doesn't allow making videos on YouTube. It is against the policies of many companies.
This is the best end-to-end flow I have seen for twitter system design.
It would be great if you could have separate videos for scaling and load balancing Redis, Kafka, and Cassandra, and point to those like you did for the URL shortener, asset service, etc.!!
Overall, thanks a lot :)
At first I was about to dismiss it because of the sound, but then the diagrams caught my eye and I gave it a chance. Now I consider myself lucky to have found this channel 😃
Awesome explanation, from initial requirements gathering to deep design. I love your videos, man. You deserve more subs/views.
Thanks!! Just getting started, we'll get there soon :)
Do share the videos with your friends/colleagues to help to get there sooner :)
He has disappeared from the scene. He is easily the best ever, with crystal clear explanations and very strong fundamentals. Undoubtedly the best!
While you are describing the design, I felt the sound quality could be improved. When you ask for feedback and comments at the end of the video, it is very clear.
I hope you get to a million subs soon man. This is amazing, just amazing!
So basically, timeline creation is:
- Active users: using Redis
- Live users: immediately send via WS
- Passive users: create when they come online
- When celebs tweet, for normal users: a pull-based approach + updating Redis
- When celebs tweet, for other celebs: if they are live, send over WS; else update the timeline in Redis
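One possible reading of this summary as a routing function (the state and channel labels are made up for illustration, and it simplifies the celeb case):

```python
def route_tweet(author_is_celeb, follower_state):
    """Decide how a follower receives a new tweet."""
    if follower_state == "live":
        return "ws"                       # connected right now: push over WebSocket
    if author_is_celeb:
        # Celebrity fan-out is too wide to precompute for everyone:
        # cached (active) users get a Redis update, the rest pull at read time.
        return "redis" if follower_state == "active" else "pull"
    if follower_state == "active":
        return "redis"                    # precompute into the timeline cache
    return "on_login"                     # passive: build timeline when they return
```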
I am so happy that I found this channel. Simply the best.
This is my first comment ever on YouTube.
Bhai! Just wanted to say, you are making me watch all your videos! It's just amazing! Keep posting more videos please.
Top Quality Content. Covers both breadth and depth keeping the simplicity.
I really enjoy watching your videos a lot. It's meditation to me.
That's great to hear!! Thanks for the kind words :)
Thanks for the video, awesome content and a very detailed explanation. But I have a query about what the partition key for the Cassandra cluster would be. If tweetId is the partition key, how do we generate the user's home timeline? Wouldn't it be a slow read if we query all cluster nodes and gather the tweets?
I have the same doubt. On one hand the recommendation is to have it pre-computed, but if data is sharded based on the tweet, how will the home timeline be pre-computed on a single server? Would really appreciate a clarification on this.
So there are multiple things stored here, for different kinds of query patterns.
First is the tweetId (partition key), tweet details, text, time, etc. This would be used by use cases where we need to process a particular tweet, let's say to show a tweet on the UI, or to fetch the details/content of a list of tweet ids, which a lot of services will call.
The other use case is where we query tweets of a user.
Here we'll store userId (partition key), tweetId, time, and some other metadata. This is where the query happens on the userId.
Now if we want to fetch tweets of a user, we'll first query the second data store, get the tweet ids, and then create the response by querying the first data store and getting the tweet contents. Similarly, other use cases can also be handled.
Hopefully, that answers :)
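The two query patterns in this answer can be modeled with two dict-backed stores, one keyed by tweetId for content and one keyed by userId for that user's tweet ids (a sketch, not the real schema):

```python
tweets_by_id = {}      # tweet_id -> {"text": ..., "ts": ...}   (partition key: tweetId)
tweets_by_user = {}    # user_id  -> [(ts, tweet_id), ...]      (partition key: userId)

def write_tweet(user_id, tweet_id, text, ts):
    """One logical write lands in both stores, one per query pattern."""
    tweets_by_id[tweet_id] = {"text": text, "ts": ts}
    tweets_by_user.setdefault(user_id, []).append((ts, tweet_id))

def tweets_of(user_id, limit=20):
    """Query store 2 for the ids, then hydrate each from store 1."""
    rows = sorted(tweets_by_user.get(user_id, []), reverse=True)[:limit]
    return [tweets_by_id[tid]["text"] for _, tid in rows]
```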
Thanks for the great content.. wondering how we will mark a user as active, inactive, or passive, and which service will do that. I believe some worker? Is it based on the last timestamp the user logged in, with the Twitter service checking whether that user is active or passive? Also, how will the tweet processor get the data about a user being live?
@codeKarle This content is easy to understand, yet it goes into great depth and talks about shortcomings and ways to address them to maintain the scaled/distributed nature. I am amazed at how easily the content is presented. It's very high quality indeed. Please put out more content! One suggestion if I may: please spend a couple of minutes on scale (network requests per min/hr/day/month, DB storage needed per week/month/year, Redis size considerations) as well as a very high-level DB schema design, at least for the important tables. This will help us a ton. Kudos to your efforts!
The best explanation of twitter system design.
This is such a great channel. So glad to have found this!
Hey Sandeep, thanks for the video, it is really helpful. I have two questions
1. Can I use DynamoDB instead of Cassandra as the database for tweets?
2. The streaming service uses Hadoop; what data is stored in Hadoop? Is it the same data as in Cassandra?
Thanks
Thanks for the awesome content and explanation. Your content is always awesome!! It would be great if you could guide us a little bit on high-level schema design and API design as well. It will help with visualising how to store and access the data.
Thanks for the great content. Please keep on posting newer topics.
You deserve million views!
No one can draw and explain these in 45 minutes, while also constantly being interrupted by the interviewer's questions. Through these videos you are putting wrong expectations in interviewers head :)
Overall a great video!! But we need more information about sharding. Other system designs have used the same DB for users + tweets to make things easy..
Ek Number !! you are making a big impact to all learners... keep up the great work 👍
Thanks for posting these informational videos. It seems like the subtitles for the twitter design video are out of sync roughly between the 30 and 40 min mark. Good luck with your future videos.
Thanks for the video. Great detail and depth.
I have a couple of questions ..
What's asset service?
Why do you need both the Tweet Ingestion Service and the Tweet Service? Can the Tweet Service not write (and read) to Cassandra "and" send the tweet to Kafka? I'm wondering why we need a separate service just for ingestion... The more services, the higher the cost in terms of maintenance, deployment, etc.
For Asset Service, refer to: th-cam.com/video/lYoSd2WCJTo/w-d-xo.html . It is basically a video serving platform for all the video/ image content, and you can think of it to be similar to TH-cam/ Netflix.
The main reasons for having two services were:
1. Tweet ingestion runs at a much lower scale than the Tweet Service. So a small spike in read traffic on the Tweet Service could potentially impact the ingestion flow big time.
2. The number of DCs in which the Tweet Service is present could be more than the number of DCs in which the Ingestion Service is present, because the Tweet Service is called by tons of other components as well.
3. There is a good probability that both these flows are maintained by different teams.
You are right that the maintenance cost would increase, but that's probably worth it. The deployment cost would be roughly similar, since at the very core the number of CPUs and cores would still be approximately the same whether we have one service doing it all or two, assuming good utilization of the hardware.
Hope that answers :)
@@codeKarle Yes, thanks a lot for answering
Great Content, Everything explained with such simplicity. It's going to help everyone. Keep posting
Great video. I have a question: "the tweet processor puts tweets for live users back into Kafka to be picked up by the notification service" - here we have to maintain a separate topic or partition for each user, right? Won't that be too many topics/partitions?
Sandeep - Kudos to you for a very clear explanation. I guess for creating the timeline for users who follow famous/hot users, you were describing the 'pull' approach, as opposed to the 'push' approach which can be used for regular users.
11:35 Fundamental question: how do you know if your cache is reliable and up to date? We have an eviction policy, alright, but will the cache entry be cleared out for a particular user on any write/update to the User DB? Is that the only way the cache can go out of sync?
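One common answer to this question, sketched with in-memory dicts (illustrative, not the video's exact design): invalidate the cache entry on every write to the user DB, so the cache refills with fresh data on the next read and can only be stale for the brief window between the DB write and the invalidation.

```python
class UserService:
    def __init__(self):
        self.db = {}       # stand-in for the User DB (MySQL)
        self.cache = {}    # stand-in for the Redis cache

    def get(self, user_id):
        if user_id not in self.cache:            # miss: read through from the DB
            self.cache[user_id] = self.db[user_id]
        return self.cache[user_id]

    def update(self, user_id, profile):
        self.db[user_id] = profile
        self.cache.pop(user_id, None)            # invalidate; next read refills
```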
I would like it if you mentioned what an interviewer can ask, or how they might cross-question. Some mock interviews would also help. It looks easy when you are not cross-questioned and are going your own way.
Thanks for sharing the video, the content is pretty good. The only issue is that many videos are quite lengthy, which won't fit into a 45-min interview. I am wondering whether it's possible to shorten your videos?
If I shorten them, most people would have a lot of questions; the content/explanation would get reduced, which would make them worse eventually.
When in an interview, you should be able to cover this content easily within 30-35 mins, giving enough time for questions and answers.
@@codeKarle In Amazon, we would get 30-40 mins to go through almost 10 System design steps ranging from Requirements to discussing caching and LB, Scaling etc. Can we cut short the component design and only discuss the important ones?
@@indranilthakur3605 I guess you can prioritize what to discuss with the interviewer on your own. @codeKarle He is doing a great job taking us to all the granular details. Kudos to him for all his efforts :)
@@deathstrokebrucewayne +1
Very well explained. Thank you !
You make it really easy for me, and your explanation is really great.
Great explanation. I love your videos. But one suggestion. Please use a proper microphone hanging on your neck area. The sound is not very crisp.
Informative video. It would be helpful if you had covered how user followers/graph db would be sharded.
great video. one of the few good ones on TH-cam
Thanks for the content! One question: you mentioned that you only cache the feed for active users. So in the tweet processor service, where you create the cache entry for a user's feed, how do you identify whether a follower of a given user is an active user?
This is the most detailed and best video on Twitter system design. I have a couple of questions/comments: how about integrating with S3 for storing images or videos, and using something like a CDN to speed up content delivery for videos? Do you think they can be part of this system? Please clarify.
Thanks for the great content! Love the way you explain concepts in great detail. But I have a query: the user service and graph service have different databases, so how are users linked with their followers? This is a many-to-many kind of relation. Are we storing user details in the graph service DB?
Very good video and channel. Finally I find a nugget of gold in this sea of garbage.
Such simple and perfect explanation - Loved it
Hi @codeKarle,
thanks for great content.
I have a few questions:
1) In the timeline Redis cache, are we storing only the tweet ID, or the content of the tweet as well?
2) Will delete/edit of a tweet also go through the same flow as creating a tweet?
3) There will also be an upvote count for each tweet; are we storing the upvote count in the same timeline Redis list? Because it will change very frequently.
1. Just the tweet ID. Tweet content can be fetched at runtime from a different cache which stores the tweet ID -> content mapping.
2. Deletes/edits can be handled using the same flow, but I would rather just update or delete the tweet content and update the corresponding cache that stores the tweet ID -> content mapping. When the timeline needs to be shown, the updated content will be shown to users, and if the tweet has been deleted, it will be ignored.
3. Upvotes would be stored separately, like you mentioned, and the data would be merged later when the timeline is rendered.
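To make the id-only layout concrete, here is a minimal in-memory sketch of the two caches described in this answer. Plain Python structures stand in for the Redis list and key-value store; the names and the 800-entry cap are assumptions for illustration, not from the video.

```python
from collections import defaultdict, deque

# Stand-ins for the two caches:
#   timelines[user]  -> list of tweet ids, newest first (Redis list in practice)
#   tweet_cache[id]  -> tweet content (a separate id -> content cache)
timelines = defaultdict(deque)
tweet_cache = {}

MAX_TIMELINE = 800  # assumed cap, analogous to trimming the Redis list

def push_to_timeline(user_id, tweet_id):
    """Prepend a tweet id to a user's cached timeline, bounded in size."""
    timelines[user_id].appendleft(tweet_id)
    while len(timelines[user_id]) > MAX_TIMELINE:
        timelines[user_id].pop()

def render_timeline(user_id, count=20):
    """Resolve ids to content at read time; deleted tweets are skipped."""
    out = []
    for tid in list(timelines[user_id])[:count]:
        content = tweet_cache.get(tid)
        if content is not None:
            out.append(content)
    return out

# Fan out two tweets to a follower, then delete one from the content cache.
tweet_cache[101] = "hello world"
tweet_cache[102] = "second tweet"
push_to_timeline("alice", 101)
push_to_timeline("alice", 102)
tweet_cache[101] = None  # deleted; the stale timeline entry is ignored on read
print(render_timeline("alice"))
```

Note how a delete only touches the id -> content mapping; the cached timeline entry becomes a no-op at render time, which is the behavior described in point 2 above.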
The videos are very lucid to watch; however, I'd suggest you stand at a corner of the board so that the whole board is visible.
Awesome content.
I think it's better for the "Tweet Ingestion Service" to push the tweet info only into Kafka, and then have the Tweet Service read from Kafka and store it in Cassandra. Ingesting into both Cassandra and Kafka at the same time can make the Tweet Ingestion Service a bit slower, and we also cannot ingest into Kafka and Cassandra atomically from the Tweet Ingestion Service.
Thoughts?
Good point. No chance of losing data due to back pressure on the service if we hand the data to Kafka immediately.
It need not be atomic; those are two different flows and can be done in parallel. Sometimes pushing to Kafka might fail, sometimes inserting into Cassandra might fail, and these failures need to be handled independently. I personally disagree with only pushing to Kafka, because the tweet action would no longer be synchronous and you won't have an ACK for the client who is waiting for their tweet to be posted successfully.
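A small sketch of the dual-write shape argued for here: the DB write stays synchronous so the client gets its ACK, while a Kafka publish failure is handled independently by queuing for replay. The function names, the injected stand-ins, and the retry queue are illustrative assumptions, not the video's actual design.

```python
def post_tweet(tweet, db_write, kafka_publish, retry_queue):
    """Synchronous DB write backs the client ACK; the Kafka publish is an
    independent flow whose failure is queued for later replay instead of
    failing the client's request."""
    db_write(tweet)                  # must succeed for the client to get an ACK
    try:
        kafka_publish(tweet)         # may fail independently of the DB write
    except Exception:
        retry_queue.append(tweet)    # replayed later, e.g. by a sweeper job
    return "ack"

# In-memory stand-ins to exercise both the happy path and a broker outage.
db, published, retries = [], [], []

def flaky_publish(t):
    raise ConnectionError("broker unreachable")

print(post_tweet({"id": 1}, db.append, published.append, retries))  # ack
print(post_tweet({"id": 2}, db.append, flaky_publish, retries))     # ack; queued
```

The second call still returns an ACK to the client even though the publish failed; only the retry queue records the gap, which is the "handle failures independently" point above.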
Amazing video. Very well explained. Thanks for posting a ton of great content Sandeep!
Thanks Nishant!
Thanks for your time sharing the knowledge.
nice content. I am preparing for sde2, will update the results here. ty.
Any updates?
Why do we need a separate service to GET the tweets? Is it because there will be lots of GET requests for timeline generation?
Thanks again Sandeep for your effort.
Hi! Your designs are very simple and clear. Congratulations! Just wondering... why don't you use ksqlDB tables to keep the users and graph updated, instead of Redis, so that when another microservice needs the information it only has to look up the KSQL view? For example, the tweet processor could access this information directly without having to ask the graph service.
System design is very subjective. The actual implementation is very different from what is explained here, and much more complex technologies are used. Every day, tons of new tweets and tons of new users are added globally. This is a very high-level design specification. You can have your own design which suits the purpose.
Redis is a cache and will be faster than the limited memory-mapped index file data in Kafka.
Very nicely articulated system design video. 👍🏻👍🏻
Great Lesson sir. Can we have a learning session on how to start implementing these, what all needs to be considered etc. etc.
Excellent! Thank you Sandeep!!
Thanks. Awesome explanation. Loved your way of teaching
Awesome content sir.
Would you please create new videos for 'System Design for Proximity Server'?
thanks.
13:00 Why is the graph service not using a NoSQL graph DB like Neo4j, and instead simply using relational MySQL?
Hi - your system design videos are awesome!! Good job! You have covered all the core points (scaling issues, mitigation, etc.). By the way, could you please post one or two videos on how to distinguish between system design and object-oriented design questions? For example, can we apply the same technique to questions like "design an elevator" or "design a parking lot system"? Appreciate your help! Thanks again!
Very nice content, Sandeep. Any plans to make a system design video on a document storage system like Google Docs or Dropbox?
Really like the architectures you build, with Kafka at the heart of them. Have you used ksqlDB? Do you bring it into your architectures?
Here, can we not add blob storage to the asset service, from where the CDN would fetch static files?
BTW great video. Will be watching others as well.
Complete graph can be seen at 35:45
Thanks for video.
Can you please make a video on designing a distributed counter that tells how many users have the website open at any point in time?
Great video. One specific question though: how many records do we keep in the cache for each user, and what happens when a user has seen all the cached tweets? A DB query?
If capacity permits, we can store them forever. This, however, would cost a lot, and that might not be acceptable. In that case we can keep some time window, say 4 weeks of content, in the cache for each user. There is no right or wrong here; it's just a cost vs latency tradeoff.
Beyond the 4 weeks, we can query the DB to get other tweets that should be shown to the user, using the same flow as for passive users.
Once a user has seen all the tweets, we will not have any more tweets to show; we can then maybe show some users that this person could follow, to keep the UX decent.
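The window-then-fallback read described in this answer could be sketched as below. The 4-week cutoff matches the example above, but the function names and the injected lookups are assumptions for illustration.

```python
CACHE_WINDOW = 4 * 7 * 24 * 3600  # assumed ~4-week retention window, in seconds

def get_timeline(user_id, now, cache_lookup, db_query):
    """Serve tweets from the cache while entries fall inside the retention
    window; once the cached window is exhausted, fall back to the DB (the
    same path passive users take). cache_lookup / db_query are injected
    stand-ins for the real stores."""
    cached = cache_lookup(user_id) or []
    fresh = [t for t in cached if now - t["ts"] <= CACHE_WINDOW]
    if fresh:
        return fresh
    return db_query(user_id)  # passive-user flow: read through to the DB

now = 10_000_000
cache = {"alice": [{"id": 1, "ts": now - 100}],                  # recent
         "bob":   [{"id": 2, "ts": now - CACHE_WINDOW - 1}]}     # expired
db = {"alice": [], "bob": [{"id": 3, "ts": now - CACHE_WINDOW - 1}]}

print(get_timeline("alice", now, cache.get, db.get))  # served from cache
print(get_timeline("bob", now, cache.get, db.get))    # falls back to DB
```

This keeps the latency-sensitive read on the cache and pays the DB cost only for the long tail, which is the cost vs latency tradeoff mentioned above.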
Great content. If I may add some feedback: The sound is pretty bad, you might want to use a microphone on your T-shirt.
When a user posts a tweet, you are saving it in Cassandra. Can you please explain the strategy you are using to store the tweet for all the followers?
Amazing video! Can you please explain how you would handle a single point of failure? Example: what if the tweet ingestion service is unavailable due to a failure? How would we handle that?
Thanks!!
There is usually one simple way to do that, which is to distribute it across multiple nodes and multiple DCs, and do the same for the databases. Also, having each service do a smaller amount of work is good, as it reduces the impact of something going down, which takes the conversation toward a microservices/SOA architecture.
But no matter what we do, there will always be something that breaks; the goal is to design the system in a way that minimizes the impact and lets us replay the transactions that came in during the outage period.
Thanks for creating such awesome content.
I have a query: how do we handle the case when the Tweet Ingestion Service is able to persist tweet info in Cassandra but unable to persist it in Kafka (due to a system crash or some network failure), or vice versa?
Very good. But for the DB, should we not use NoSQL like Cassandra?
Why not use Kafka as the source of truth and add a Kafka connector that reads from Kafka and publishes to Cassandra? Otherwise, we would need some reconciliation service between Cassandra and Kafka to make sure both data sources stay aligned, or we would need to make sure that both Cassandra and Kafka are available and the write was ACKed by both in the ingestion service.
I see that there is a lot of Redis, but wouldn't it make more sense to always have Kafka as the broker and add connectors that can write to Redis? Maybe not in all use cases, but most of them are not latency sensitive.
Bro, make a video on LinkedIn system design and database architecture.
Thanks, bro, for such a detailed design. Appreciate all the hard work.
A naive question: doesn't storing tweets in the Elasticsearch cluster result in duplication of data (they are already present in Cassandra)?
In other words, can't the search service just work out of Cassandra? Am I missing anything (pros/cons)?
If you read some articles about Elasticsearch, it is typically set up to stay in sync with Mongo/Cassandra to handle search. I think it is for scalability: using Elasticsearch for search instead of searching in Cassandra.
Can you also describe how we would build the trending-tweets flow?
Which graph database do you recommend using... Neo4j or something else?
It's very hard to watch other system design videos now; these are pretty comprehensive.
How come no one talks about the delete-tweet flow?
What happens to pre-computed timeline feeds? Would you delay timeline creation, or would you track which tweet is part of which pre-computed timeline?
Very helpful content. Only one suggestion: switch to a better mic. The audio quality is very poor. Thanks.
Great video, great channel. Thank you again!
Quick suggestion: please use the chapters feature to break up the video. It will help show the coverage of topics and give the flexibility to skip a chapter/topic.
Hi sir, loved your explanation. Keep up the great work, will check out your other videos too..
Could you suggest a good book that gives me an intuition for designing such data systems? Thank you :)
There should obviously be a post cache in front of the Cassandra cluster. I'm not too sure of the read efficiency of Cassandra.
Details kicked ass! Awesome...
Is it worth mentioning Hadoop anymore in this day and age?
Do you actually need the graph DB, or would just keeping a uid:follower DB and a uid:following DB suffice?
Why do we use a MySQL DB for the followers etc.? Isn't a KV type of storage better? Or even some graph DB?
Great content. Any plans to make videos on 1) BookMyShow 2) Dropbox?
@CodeKarle Why do all your designs go against a microservice architecture, e.g. the User Service being the single source of truth for every user and being queried by other services, whereas in microservices each service does not call another service directly? I have been asked this question in my interviews; can you please let me know how to respond to such questions, as I mostly follow your approach in interviews?
Any service can call any other service. There are two patterns: orchestration and choreography.
Good explanation!!
Thanks for the great video. Why is the User DB hosted in a MySQL database instead of a NoSQL store like MongoDB? Is it because you are already using a Redis cache?
Great video. I have the same query regarding MySQL. Looking at the scale, and the fact that the data would not be updated much, shouldn't NoSQL be used here? Also, maintaining MySQL would be very expensive. Even if a Redis cache is being used, the database layer is still SQL and more expensive to shard. Any thoughts, anyone?
What will be the key in Redis when the search service uses it? There can be many phrases that refer to the same search term.
What should be the partition key in the database for Twitter?
As per this design, wouldn't we violate our NFR of quick rendering of timeline for passive users?
@codeKarle This tutorial series is amazing; thank you for creating it. Quick question:
Why not use Cassandra for the graph DB? There are limited types of queries needed for this, like "get all followers by userId" or "get all followees by userId",
so we could store the data in Cassandra like:
followerId -> [list of followee ids]
followeeId -> [list of follower ids]
Then accessing all the followers of all the followees of a user should be faster than the traditional MySQL approach, and would therefore improve performance for things like feed generation.
Do we need ACID properties for the graph DB?
Updates in these tables mainly append a new follower or followee to the lists, which should be fine without an RDBMS as well, right?
Agreed with this. Also, we can store user information in Cassandra as well, keeping in mind the scale of users, sharded by user ID.
Because Cassandra handles frequent updates and deletes poorly: in production the system would accumulate huge numbers of tombstones. Cassandra is good for profile details, not for data that is deleted and updated often. People follow and unfollow very frequently, which would cause huge tombstone buildup.
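For reference, the two follower/followee tables proposed in this thread behave like a pair of adjacency maps. Here is a tiny in-memory sketch of that shape (all names are illustrative, not from the video):

```python
from collections import defaultdict

# Stand-ins for the two tables discussed above:
#   followers[user] -> set of users who follow `user`
#   following[user] -> set of users that `user` follows
followers = defaultdict(set)
following = defaultdict(set)

def follow(follower, followee):
    """A follow writes to both tables so each direction is a single lookup."""
    following[follower].add(followee)
    followers[followee].add(follower)

def feed_sources(user):
    """Everyone whose tweets should appear in `user`'s feed."""
    return following[user]

follow("alice", "bob")
follow("alice", "carol")
follow("dave", "bob")
print(sorted(feed_sources("alice")))  # ['bob', 'carol']
print(sorted(followers["bob"]))       # ['alice', 'dave']
```

The double write on follow is the price of making both "who do I follow" and "who follows me" single-key reads, which is what makes the Cassandra-style partition-per-user layout attractive for feed generation.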
U a System Design God