Thanks for watching! Visit interviewpen.com/? for more great Data Structures & Algorithms + System Design content 🧎
Thank you for sharing the design... I am a beginner Business Analyst, and the talk is a bit fast for me to follow, so I could not catch the part about pagination. I wonder in which aspects a Business Analyst should know about it and how deep to go, as I have heard some seniors talk about it and I have no idea how this knowledge would help... Thank you in advance, and looking forward to more of your videos :)
This content extends beyond interviews; it's the essence of the world of SaaS, something every software engineer MUST eventually run into in this day and age.
Absolutely!
The MOST CLEAR design of Twitter!
Thanks!
I’m watching every video here multiple times. Y’all are filling a niche that is much needed.
thanks!
I've seen multiple system design interview prep videos but this one is by far the most eye-opening and practical explanation. Thank you for posting this video!
Thanks for watching!
This was so easy to understand, going from a basic design and then introducing the components based on complexity, scaling, and needs rather than thinking about them at the very start. Thanks, and looking forward to more such design videos.
Thanks for watching 👍
Great video! We should probably add/consider some details on how to manage follower relationships to perform fanout tasks. One idea could be to use a separate graph database and possibly a distributed cache on top of that database.
Also, for the follow/unfollow API we could be more consistent with REST conventions:
Follow: POST /following
Unfollow: DELETE /following
with UserId as the parameter for both.
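For illustration, a minimal sketch of those two routes (Flask here; the auth helper and in-memory store are just placeholders, not part of the video's design):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
follows = set()  # placeholder for the real follower store / social graph


def current_user_id() -> str:
    # stand-in for real authentication middleware
    return request.headers.get("X-User-Id", "anonymous")


@app.route("/following", methods=["POST"])
def follow():
    # start following the user given by the userId parameter
    target = request.get_json()["userId"]
    follows.add((current_user_id(), target))
    return jsonify({"following": target}), 201


@app.route("/following", methods=["DELETE"])
def unfollow():
    # stop following the user given by the userId parameter
    target = request.args["userId"]
    follows.discard((current_user_id(), target))
    return "", 204
```

Returning 201 on follow and 204 on unfollow keeps the two operations symmetric on the same /following resource.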
Sounds good, thanks!
This is truly awesome. I love how the complex things are explained in all the videos, thanks!! (waiting for more)
Thanks for the kind words - we'll be posting more!
Wow! This is the type of system design interview I've been looking for over the last few weeks...⭐
Glad you enjoyed it!
Here are some of the issues in this design.
1. The cache and timeline construction are the most difficult parts to solve, yet we learn too little about how they are arranged.
1.1. Sure, if you have only 2 users, it will scale well. But what if someone has 180M followers, like Musk? Will you fan out 180M cache and database updates for a single tweet?
1.2. It completely ignores geography. What if the DB shard for Elon is in Texas and his followers are spread across the world?
1.2.1. Where do you store data for a user from a specific country? What if they travel?
2. Social graph.
2.1. It sounds like on every tweet write you query the social graph to figure out where to fan out. How does this scale? How many shards do you need to read just to know where to fan out to?
2.2. What if a user unfollows?
2.3. Where is the graph stored?
The design presented will not scale well to millions and billions of users.
Thanks for the thoughts. You're right that this design fans out to all followers, so we're sacrificing write speed for users with many followers in order to maintain read performance for the majority of users. As you note, there are many other details we could dive into, such as using a social graph and implementing geographic sharding. Thanks for watching!
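To make that tradeoff concrete, here's a rough sketch of the fan-out-on-write step, assuming a capped Redis list per follower as the timeline cache (key names are hypothetical and the follower lookup is stubbed):

```python
import redis

r = redis.Redis(host="localhost", port=6379)
TIMELINE_LEN = 800  # keep only the newest N tweet IDs per follower


def get_follower_ids(author_id: str) -> list[str]:
    # stand-in for the social-graph lookup
    return []


def fan_out_tweet(tweet_id: str, author_id: str) -> None:
    """Push a new tweet ID onto every follower's cached timeline."""
    for follower_id in get_follower_ids(author_id):
        key = f"timeline:{follower_id}"
        pipe = r.pipeline()
        pipe.lpush(key, tweet_id)             # newest first
        pipe.ltrim(key, 0, TIMELINE_LEN - 1)  # cap the list length
        pipe.execute()
```

With 180M followers this loop is exactly the write amplification being discussed, which is why a hybrid approach--skipping fan-out for very high-follower accounts and merging their tweets in at read time--is a common refinement.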
This is amazing content.
An alternative design could perhaps rely much more heavily on Kafka: save all the tweets in a topic/partition and only move them to the DB once they're a year (or whatever) old.
That way you could retrieve the timeline easily and also stream the new tweets. The DB would be simpler and perhaps we could get rid of the mem cache...
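A minimal sketch of the producer side of that idea (kafka-python; the topic and field names are made up):

```python
import json
from kafka import KafkaProducer

# Tweets live in Kafka first; an archiver job would move records older
# than the retention window (e.g. one year) into the database.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)


def publish_tweet(author_id: str, tweet_id: str, text: str) -> None:
    # key by author so one user's tweets stay ordered within a partition
    producer.send(
        "tweets",
        key=author_id.encode("utf-8"),
        value={"tweetId": tweet_id, "authorId": author_id, "text": text},
    )
    producer.flush()
```

Serving timelines straight out of a topic is the harder part, since Kafka reads are sequential per partition rather than per user.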
Thanks! Interesting thoughts, would be curious to see how using Kafka in that way would perform in practice :)
Thank you very much for this, may you be rewarded with abundant goodness👍🙏
thanks - more videos coming!
Great video! Enjoyed watching it. One thing really bothered me: that a write API would have to calculate and produce messages for followers' timelines. I would probably make it produce messages with write operations, then have some consumers process which update goes where and produce a new message to notify users. Although even this split wouldn't allow for more agile logic, i.e. prioritizing tweets going to timelines based on dynamic factors like time of day, breaking news, or changes in user preferences.
Really glad you liked it. This is a super interesting point to bring up, and I agree that separating the timeline logic from the write API would make the system more maintainable. And as you mentioned, introducing a queue between the write API and this hypothetical timeline service would make tweeting faster on the user end while enabling the write API to handle higher loads. As far as I know, tweets always stream to the top of the feed and the rest of the timeline never changes, so this approach should work fine for "dynamic" timeline algorithms as well (but let me know if I'm misunderstanding). Stay tuned for more content :D
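To make that split concrete, here's a rough sketch of the consumer side of such a queue (kafka-python; the topic name and the two helpers are hypothetical stand-ins, not the video's design):

```python
import json
from kafka import KafkaConsumer


def lookup_followers(author_id: str) -> list[str]:
    # stand-in for the social-graph lookup
    return []


def update_cached_timeline(follower_id: str, tweet_id: str) -> None:
    # stand-in for the Redis/Memcached timeline update
    pass


# A timeline worker: the write API only publishes "tweet created" events,
# and this consumer decides which cached timelines to update.
consumer = KafkaConsumer(
    "tweet-events",
    bootstrap_servers="localhost:9092",
    group_id="timeline-workers",  # workers share the load of one group
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for event in consumer:
    tweet = event.value
    # routing / prioritization logic would live here
    for follower_id in lookup_followers(tweet["authorId"]):
        update_cached_timeline(follower_id, tweet["tweetId"])
```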
@interviewpen Thank you for replying. It's a great point about the tweets going on top of the accumulated timeline. I believe it would work for most services with a timeline/feed.
Continue this series. This channel is so underrated 🎉❤
We will!! More content coming.
Thanks for the content. I have the following questions:
- Where does the sharding logic reside? I think it must be the application doing the sharding. Please correct me if I'm wrong.
- How does using tweetId+timestamp actually help in preparing the timeline? The timeline needs tweets from the people the user is following, and the approach mentioned at 21:57 doesn't seem to help (does it have something to do with using joins, since it's a relational DB?). IMHO the useful thing would be to have all tweets pertaining to a timeline on a single shard, because if they're spread across multiple shards, that's a lot of requests across shards to fetch the tweets.
1. Generally whatever database you choose will include sharding; you're just responsible for picking the key! MongoDB is a good example of a database that supports sharding out of the box.
2. Using tweet ID and timestamp allows the data to be distributed evenly across shards, meaning there aren't certain shards with significantly more traffic or more data than others. You're right--to get a user's timeline, we'd have to query every node, but as long as the data is indexed properly within each node, each node still does less work, which lets us handle higher traffic. There's no great way to shard by timeline directly (e.g., think about what happens when a user has no followers--where do their tweets go?), but the Redis cache should help with this since it is organized by user. There are tons of ways to set up a sharded database and each has pros and cons, so great question!
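As a rough illustration of that setup (assuming MongoDB and pymongo, with made-up database/collection names and a sharded cluster already running):

```python
from pymongo import DESCENDING, MongoClient

client = MongoClient("mongodb://router:27017")  # mongos router

# Shard the tweets collection on a hashed tweet ID (snowflake-style IDs
# embed the timestamp) so writes and data spread evenly across shards.
client.admin.command("enableSharding", "twitter")
client.admin.command(
    "shardCollection", "twitter.tweets", key={"_id": "hashed"}
)

# Within each shard, an index on (authorId, createdAt) keeps per-user
# timeline queries cheap even though they fan out to every shard.
client.twitter.tweets.create_index(
    [("authorId", DESCENDING), ("createdAt", DESCENDING)]
)
```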
Thanks for watching!
Could someone also explain the issue with pages and fetching n tweets at 6:25? What I understood is that as new tweets arrive, the backend needs to carefully calculate the "n" tweets while accounting for the new tweets coming into the system.
But couldn't we just keep appending new tweets to the top? That would mean if we previously had tweets 1-10 (assuming n is 10) and 3 new tweets came in, the page becomes (1-7) + 3 new tweets.
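One way the shifting-page problem is usually avoided is cursor-based pagination: the client passes the last tweet ID it saw and the server returns only older tweets, so new arrivals only ever prepend to the top as described above. A minimal sketch (plain Python, names made up):

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    tweet_id: int  # snowflake-style: a larger ID means a newer tweet
    text: str


def get_timeline_page(
    timeline: list[Tweet], limit: int = 10, cursor: int | None = None
) -> tuple[list[Tweet], int | None]:
    """Return up to `limit` tweets older than `cursor`, newest first."""
    newest_first = sorted(timeline, key=lambda t: t.tweet_id, reverse=True)
    if cursor is not None:
        newest_first = [t for t in newest_first if t.tweet_id < cursor]
    page = newest_first[:limit]
    next_cursor = page[-1].tweet_id if page else None
    return page, next_cursor
```

Because the cursor anchors on a specific tweet rather than an offset, later pages never reshuffle when new tweets arrive.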
The visual presentation of how the thinking arrives at a solution is great, but I would rather see a solution to the 'celebrity effect' for this Twitter design.
Well-explained design! It was quite useful.
thanks!
Great content. It is not clear how to fetch a user's tweets when the key is a snowflake ID and the data is distributed over multiple nodes?
good stuff, learned some things!
btw, what iPad app do you use for sketching?
Thanks! We use GoodNotes.
Great content, thanks guys.
Sure - thanks for watching
This is great!! Could you please cover designing a chat system and a monitoring system (time series DB), if possible? Thanks!
We'll add it to the list, thanks for watching!
@interviewpen Also, I think it would be helpful to add some memory/storage estimates and a global audience (multiple regions) use case.
will do!
Maybe I misunderstood the proposal, but how exactly is the memory cache going to work if it is an in-memory solution?
That necessarily has to take into account the number of active users. For example, say we have 1MIL active users per day; then we need to maintain a cache of 1MIL entries (1 entry for each user) with 100 tweets each (this is only for day 1, as a simplification).
If we store the tweet ID only, that could potentially work: it means we need 1MIL users * 100 records of a size on the order of bytes--say 16 bytes (random number). In this scenario we would need 1.6GB of memory, which sounds reasonable for a cache, although we would still need to fetch each tweet's content, which in turn sounds a bit messy.
On the other hand, if we need to store the tweet content AND the tweet ID, we would require roughly 15.6GB of memory assuming a 16-byte tweet ID and 140 bytes of content (1MIL * 100 * 156 bytes), which sounds much less comfortable for a single cache node.
EDIT1: typos 😅
Good thinking, all your math sounds reasonable. However, I disagree that ~16GB of memory is unreasonable...there are plenty of systems with that much memory or more. It is also possible to shard an in-memory cache if the dataset becomes too large to fit on one node. There are also a number of other caching strategies for a system like this that may be more or less effective :) Thanks for watching!
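For reference, a quick back-of-envelope check using the same assumptions (1M daily active users, 100 cached tweets each, 16-byte IDs, 140-byte tweet bodies):

```python
USERS = 1_000_000
TWEETS_PER_USER = 100
ID_BYTES = 16
BODY_BYTES = 140

ids_only = USERS * TWEETS_PER_USER * ID_BYTES                 # 1.6e9 bytes
ids_and_bodies = USERS * TWEETS_PER_USER * (ID_BYTES + BODY_BYTES)

print(f"IDs only:       {ids_only / 1e9:.1f} GB")        # ~1.6 GB
print(f"IDs + contents: {ids_and_bodies / 1e9:.1f} GB")  # ~15.6 GB
```

Either figure fits on modern cache hardware, and the cache can always be sharded if per-node memory becomes the constraint.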
Awesome video, very well explained. Subbed and liked. I'd just like to add some thoughts on this design; I understand that there are always pros and cons to any system design.
However, I would like to point out a potential issue related to the websocket connection used to push an event back to the client, displaying a popup of sorts that lets the client perform the API call to fetch the new timeline.
Based on the final design, the logic between Client, LB, and READ API makes sense: the LB can use sticky-session load balancing to hit the same READ API instance, as I believe the READ API is scaled horizontally, correct?
However, does that mean that in this design every scaled READ API instance has to be its own consumer group? Otherwise, if the READ API instances share the same consumer group, there can be a case where:
- Client connects to Server1
- Server3 picks up the event but does not have the connection to the Client to push the new timeline update.
So, if every scaled instance of the READ API uses a unique consumer group in Kafka, then the event can be "fanned out" to all instances. This resolves the issue, but it leads to many events being consumed and ignored. Another point: for this event there is no need for more than one partition per topic, as there is only 1 consumer instance running per unique group ID.
Feel free to point out any inconsistency in my explanation here.
Really great thoughts, thank you! You're right that usually the data would have to be fanned out to all consumers in a setup like this. Consuming and ignoring a message shouldn't be too much overhead, but at a very large scale it could become problematic. Another approach would be to use a separate topic for each user--this would mean an API node could subscribe to only the users it's connected to, but it adds a ton of overhead in the Kafka cluster itself. Perhaps a more traditional message queue like RabbitMQ would be better for this use case--we could set up a queue for each API node, and when a user connects, its API node could create a binding to route that user's data to its queue. Hope that helps!
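Roughly what that binding approach could look like with pika (queue/exchange names are made up, and the websocket push is just a print):

```python
import pika

NODE_QUEUE = "read-api-node-1"  # one queue per READ API instance

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# A direct exchange routes each timeline event by user ID.
channel.exchange_declare(exchange="timeline-events", exchange_type="direct")
channel.queue_declare(queue=NODE_QUEUE, exclusive=True)


def on_user_connected(user_id: str) -> None:
    # When a user opens a websocket to this node, route their events here.
    channel.queue_bind(exchange="timeline-events", queue=NODE_QUEUE,
                       routing_key=user_id)


def on_user_disconnected(user_id: str) -> None:
    channel.queue_unbind(queue=NODE_QUEUE, exchange="timeline-events",
                         routing_key=user_id)


def handle_event(ch, method, properties, body) -> None:
    # push the update down the websocket for this user (routing_key)
    print(method.routing_key, body)


channel.basic_consume(queue=NODE_QUEUE, on_message_callback=handle_event,
                      auto_ack=True)
channel.start_consuming()
```

The write path would then publish to the "timeline-events" exchange with the recipient's user ID as the routing key, so only the node holding that user's connection receives the event.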
@interviewpen awesome, yes good approach as well. Thank you for sharing!
I'm not sure I understand using tweet ID + timestamp as a shard key -- doesn't each tweet have a unique ID? Wouldn't that lead to as many shards as there are tweets? (And no tweet has multiple timestamps, so...)
If it were the user ID (uid), I think it would make sense, since you want to request the tweets of a given user over a given time period.
Sharding by user could work, but we run into some problems since data from certain users is accessed far more frequently than others. It's OK to have as many shard key values as tweets--a single node in our database cluster can be responsible for many records with different shard keys. This does lead to a lot of queries having to check every node in the cluster, but that's why we have the cache!
Amazing.
Thanks! We have a lot more content coming.
@interviewpen cool
I wouldn't have gone down the sharding-plus-cache approach for the write database; it's too complex. Writes are a small percentage of traffic. You still have consistency problems (CAP theorem) anyway, and all you have done is add hops and latency and reduce your SLA. A write queue would have been simpler IMHO.
Right, we'd have to go into the math to see if a single node could handle the writes and storage size. Thanks for watching!
Right.
Respect 🙌🙌🙌 🇮🇳💝💝💝💝💝
thanks for watching!
Why don't we create 2 load balancers, one each for writes and reads?
Yes, we could absolutely do that. This is mostly an infrastructure question of how we want to manage ingress into our system.
Arriving here after Threads by Instagram is out. Pretty sure someone from Meta saw this video and gave the idea to higher management, lol
🤠 thanks for watching
"120 chars as we on the normal Twitter" ... the good old days :D
:)
What app do you guys use to draw? It's so beautiful 😭
We use GoodNotes on an iPad. Thanks!
This interview misses critical discussion of the way we send new tweets to users. I think this system doesn't work as expected, especially with a limited list in Memcached. For example, what if someone has 30 million followers?
Yes, having a large number of followers would increase latency when making a new tweet. This is a tradeoff: we're optimizing for fast timeline loads over fast posts. Thanks for watching!
Client supplying timestamp "is a bad idea. Let's do it anyway"
👍
👍
I thought I knew everything 😂
Lol :) Glad you enjoyed it
This handwriting is really ugly; it seriously hurts the viewing experience.
ok - we'll try to write a bit neater next time! Thanks for watching!
@interviewpen favour