Excellent video, I think in system design I tend to think it's going to have many more moving parts but this shows that sometimes it's just client-server-DB on steroids.
Yeah it’s true, this is a relatively simple example. It can get way crazier when talking about other common system design questions like, say, how would you build Uber, or how would you build Spotify.
What a coincidence, i am actually building a url shortener as a personal project and your video has provided me with more than enough information to build on what i already have. Great video.
Heh, if you get a short url domain, you are bound by law to create an url shortener app. I got the last single letter domain in my country, so of course I did exactly that. Not going to post the link here, don't know if my tiny server would handle the traffic
The starting part where he explained the calculations of storage and other stuff gave me a GOOSEBUMP, just imagine explaining the same flow when the interviewer asks you this!!
Saw this video, watched the rest of your videos, learned a bunch, and came back to say thanks. I really liked the system design videos. Seems like the algorithm liked it too. Keep up the good work and see you in the next one.
Great video. Couple of questions- 1. You said there shouldn't be more than 1 short urls for a single long url. So do you check in the database if a long url already has a short url or does the "hash" function takes care of it? 2. Do you just return the same short url for a different person if he has requested for long url already created by a previous person? If yes then why do you store the userId in the table
yes! great questions. 1. In theory the hashing technique (or the random keygen + dedicated db technique) should always create unique hashes so there should be no collisions. But it's cheap to just double check the DB and make sure no shortURL already exists so there are no duplicates/collisions. So we can just go ahead and do that as well. 2. That's a great point, I honestly hadn't really considered. I think for user experience you would want to give person 2 a fresh unique tinyURL (especially if they are requesting a custom tiny url). So there would be 2 different entries in the database where the keys are different tiny hashes, but the values both include the same long URL. so to your point, yeah the userID might not be necessary
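A minimal sketch of the "double check the DB" step described in the reply above. The plain dict stands in for the database, and `create_short_code`, the 7-character length, and the retry limit are hypothetical names and values chosen for illustration, not the video's implementation. In a real database the check and the insert would need to happen atomically, for example through a unique constraint on the short code.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters


def create_short_code(long_url, store, length=7, max_attempts=5):
    """Generate a random short code and store it, retrying on a collision."""
    for _ in range(max_attempts):
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # The cheap existence check from the reply: only insert if unused.
        if code not in store:
            store[code] = long_url
            return code
    raise RuntimeError("could not find a free short code")


store = {}  # stand-in for the database: short code -> long URL
first = create_short_code("https://example.com/some/long/path", store)
second = create_short_code("https://example.com/some/long/path", store)
```

Here two requests for the same long URL get two different codes, which matches the second point in the reply: two entries whose keys differ but whose values hold the same long URL.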
Food for thought: What if we did not want the tiny url to be stored forever? Say we want it to be available for a short span of say 2 hours? What's the approach?
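One possible approach to the question above, sketched under assumptions rather than taken from the video: store an expiry timestamp next to each entry and treat expired entries as missing on read. `ExpiringStore` and its injectable `clock` are hypothetical names. Key-value stores with native TTL support (Redis `EXPIRE`, DynamoDB TTL) can do the same thing without application code.

```python
import time


class ExpiringStore:
    """In-memory stand-in for a store whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._data = {}  # short code -> (long_url, expires_at)

    def put(self, code, long_url):
        self._data[code] = (long_url, self.clock() + self.ttl)

    def get(self, code):
        entry = self._data.get(code)
        if entry is None:
            return None
        long_url, expires_at = entry
        if self.clock() >= expires_at:
            del self._data[code]  # lazy deletion: clean up when read after expiry
            return None
        return long_url


TWO_HOURS = 2 * 60 * 60
now = [0.0]  # fake clock value, in seconds
links = ExpiringStore(TWO_HOURS, clock=lambda: now[0])
links.put("abc1234", "https://example.com/long")
```

Lazy deletion keeps reads correct, but entries that are never read again stay in storage, so a periodic sweep would still be needed to reclaim space.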
Hi there, I really loved the way you explain, excellent video! Do you mind to share the tools you used to record this video and create this presentation?
great video, that demonstrate the importance of thinking a bit in advance, before start coding. Eventually we end up with a cache lookup system. I have some questions... 1) Do you consider validating the URLs? Is there a limitation? What if someone would basically start to use this as a free cache store... 2) Are these tiny URLs are public or do you need the access keys to get to the real one? 3) I am wondering if you could possibly use a simple counter and hash that, instead of the whole URL. That would be faster and the hash would have a great distribution as well. 4) If you have the same hash for the same URL, it would be hard to delete the entry later, since other client has the reference. However, that could be a "prime" feature
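On question 1, a rough sketch of what basic URL validation could look like. The allowed schemes and the 2048-character limit are assumptions made for illustration; the video does not specify either. `is_valid_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048  # assumed limit, not from the video
ALLOWED_SCHEMES = {"http", "https"}


def is_valid_url(url):
    """Accept only reasonably sized http(s) URLs that include a host."""
    if len(url) > MAX_URL_LENGTH:
        return False
    try:
        parts = urlsplit(url)
    except ValueError:  # raised for some malformed inputs, e.g. bad IPv6 hosts
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)
```

A length cap plus a scheme allowlist is one way to discourage using the service as a free general-purpose store, though rate limiting would likely matter more in practice.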
Great video, you have a subscriber! Had a couple of questions about the shortening approaches: 1. On the key-generation approach: What's the rationale behind pre-generating keys? Are you trying to avoid a uniqueness check at creation time? Would the UNIQUE index on an SQL table be too big/slow? 2. On the hashing approach: Does the hash function guarantee equal distribution amongst the buckets? Not sure if picking the first letter out of the hash guarantees that. If not, perhaps re-hashing the hash with a function that guarantees a uniformly random output might do the trick. All this to say that skewed shards might be a big problem.
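The skew concern in question 2 can be illustrated with a small experiment. This is a sketch under assumptions: SHA-256 stands in for whatever hash the video uses, and the URLs are synthetic. A single hex character has only 16 possible values, so mapping it onto a shard count that does not divide 16 (3 here) gives one shard a larger share, while reducing the whole digest modulo the shard count stays close to even.

```python
import hashlib
from collections import Counter


def shard_by_first_char(url, num_shards):
    """Pick a shard from the first hex character of the hash (16 values)."""
    first = hashlib.sha256(url.encode()).hexdigest()[0]
    return int(first, 16) % num_shards


def shard_by_full_hash(url, num_shards):
    """Pick a shard from the whole digest interpreted as an integer."""
    digest = hashlib.sha256(url.encode()).digest()
    return int.from_bytes(digest, "big") % num_shards


def distribution(shard_fn, num_shards, n=30000):
    """Count how many of n synthetic URLs land on each shard."""
    return Counter(
        shard_fn(f"https://example.com/page/{i}", num_shards) for i in range(n)
    )


first_char_counts = distribution(shard_by_first_char, 3)
full_hash_counts = distribution(shard_by_full_hash, 3)
```

With 3 shards, the first-character scheme sends 6 of the 16 hex values to shard 0 and 5 to each of the others, so shard 0 carries roughly 37.5% of the keys instead of 33.3%. If the shard count divides 16, the first-character scheme is even as well, assuming the hash itself is uniform.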
Hi. In the article with base64 conversion there is a problem to address. We have a counter that is shared across servers, so that counter is a critical section, and a race condition might arise as different servers can read the same value and generate the same short URL. We must ensure mutual exclusion in that case. Correct me if I am wrong :)
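The race described above can be sketched inside a single process, with a lock standing in for whatever provides mutual exclusion across servers. This is an illustration under assumptions: `SafeCounter` and `base62` are hypothetical names, and the 62-character alphabet is a choice made here (the comment mentions base64). Across real servers the equivalent would be an atomic increment in a shared store (for example Redis `INCR` or a database sequence) or handing each server its own range of counter values.

```python
import string
import threading

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62(n):
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class SafeCounter:
    """Counter whose read-increment-return step runs under a lock."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def next(self):
        with self._lock:  # only one caller at a time may read and bump the value
            self._value += 1
            return self._value


counter = SafeCounter()
codes = []


def worker():
    for _ in range(1000):
        codes.append(base62(counter.next()))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every counter value is handed out exactly once, all 8000 generated codes are distinct.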
Great video. Principal Engineer here, learned a thing or two and you touched on just enough. I liked the stats, however depending on how popular this service is these things would change a bit. An idea for more content would be to break these up into something like a hobby project, medium size (whatever that means), and enterprise level (what you displayed here). Regardless loved it!
Would that cache size estimate be good enough? I think we should estimate cache size based on the short URLs that get hotter than the others. So if it's only 20% of the short URLs created within a month, we can roughly expect ((500 bytes * 100 mil) / 100) * 20 = around 9 GB.
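The arithmetic in the comment above, written out using only the numbers the comment gives. The result is 10 GB in decimal units, which is about 9.3 GiB, so "around 9 GB" holds if binary gigabytes are meant.

```python
BYTES_PER_ENTRY = 500             # from the comment
NEW_URLS_PER_MONTH = 100_000_000  # "100 mil" from the comment
HOT_PERCENT = 20                  # share of the month's URLs assumed to be hot

monthly_bytes = BYTES_PER_ENTRY * NEW_URLS_PER_MONTH  # 50,000,000,000 bytes
cache_bytes = monthly_bytes * HOT_PERCENT // 100      # 10,000,000,000 bytes

cache_gb = cache_bytes / 10**9   # decimal gigabytes
cache_gib = cache_bytes / 2**30  # binary gibibytes

print(f"{cache_gb:.1f} GB, {cache_gib:.2f} GiB")
```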
When talking about choosing between RDS vs NoSQL, IMO I was a little uncertain when it mentioned strong (RDS) vs eventual consistency (NoSQL). For RDS with a single instance, it's fair to say it provides strong consistency, but once replica nodes are involved, RDS may not guarantee consistency either.
No particular reason for those specific numbers. This is a napkin calculation so the idea is to say “here we’ll get way more reads than writes”. for this exercise it wouldn’t make much difference if the ratio was 400:1 or 1000:50 etc.
Crazy to think how much there is to even a simple seeming thing and then you realize there is still so much missing like authentication, authorization, payment handling, plans, email confirmations, internationalization of the website, possibly rate limiting for non paying customers…
Thanks for the great video! It covers many important points. However, I think the SQL vs. NoSQL explanation isn't entirely accurate. Eventual consistency isn't exclusive to NoSQL databases; both relational (SQL) and NoSQL databases can typically be configured to replicate writes either synchronously or asynchronously. The ACID properties relate to transaction management and do not address data consistency across database replicas. Word "Consistency" in case of ACID means "DB ensures that a transaction can only bring the database from one consistent state to another, preserving database invariants like key uniqueness etc"
I think a more realistic solution would use cloud providers and their services which for this case is even more simple with some KV storage and serverless functions
So, you just make up those numbers and define your own requirements and throw a bunch of servers at these things. I don't understand why people take system design interviews. That just seems to me like common sense to load balance your requirements. Any software dev should know them.
How exactly will I earn money with that scale? 😮 Service should also provide some statistics, so users know if they draw any traffic and when. Add these to your calculations - data + load You said 100 years - where is expiry date? How often do you check when to delete? 100 years idea is just stupid overkill, stretching whole budget, give few years or check traffic to each link and if something seems not used then delete it.
one of those videos where you get most from viewing time - very concise, effective, no side bs, covers edge cases etc 💜
Amazed by the simplicity with which you explained everything, I think I will never forget a url shortner design now
Excellent and concise video with no fluff. Thank you so much!
Solid, concise and the perfect warm up for every time I'm doing a last minute refresh before an interview.
Excellent video. Well done
What a coincidence, i am actually building a url shortener as a personal project and your video has provided me with more than enough information to build on what i already have. Great video.
Heh, if you get a short url domain, you are bound by law to create an url shortener app. I got the last single letter domain in my country, so of course I did exactly that. Not going to post the link here, don't know if my tiny server would handle the traffic
@@jonragnarsson what's a short url domain? also, does that mean x (twitter) should provide a url shortening service on the new domain as well?
15 mins of video and hours of value, great video Loved it👍
Would love to see more videos like this on designing systems
10/10 explanations and visuals. We need more!
This is the best explanation of this I have seen. Thank you!
Amazing channel! It's been some time since I enjoy a content so much! please keep the good work
Love this content, wish there were more system design videos like this on YouTube. Thank you.
Simple and just deep enough for a non-coder to understand. Awesome!
Really good explanation.
Great explanation...............best one so faaaaaaar
Great video
Love the look into the details
Very nice breakdown, reveals a lot of useful information.
What a nice piece of knowledge and thought delivery 👏👏
Loved your explanation. Looking forward for more videos.
Just one word - Perfect. Something I was looking for! Thanks
Love the way you explain things in a simple way! Would love to see more system design videos from you!
thanks! just published a new one
@@codetour YESSSSIR! Can't wait to watch it tonight when I get home!
Great video and easy to follow explanation! Looking forward to more
Really amazing content. Keep it up!
Great video! Got an interview next week which I was told I’m gonna be designing a system similar to the one in the video, this helped me a lot.
Awesome! Appreciate you watching
Great video, it is simple, informative and easy to understand
Thanks a lot
i m just loving it
Great tutorial🎉❤
I love this kind of videos covering the theoretical part of programming! Great information on system architecture😁
Simple and yet scalable design compared to others. Thanks a lot...
Glad the algo picked up your video your channel is really good
Very nice
Do more system design videos. Absolutely loved it 👍
I'm glad I discovered this channel!
Amazing.. Bravo 👍👍
Great video. Super informative.
Appreciate you!
Love It, simple and informative
Thank you so much for the quality content. Please make a series of system design! I feel so struggle and inefficient when designing complex system.
crazy good explanation
Thanks for the video
Amazing simplicity bro, keep rocking
Learnt a lot of concepts in this one. Thank you so much!!😅
pleaaase more of this , the format is brilliant
I love the content that youtube has finally started dropping in my recommended
Awesome .. keep going and do more videos about SD ♥
I think you should have expanded on how to compute that tinyurl. Its more relevant than explaining the lru which was very superficial
@Codetour very clear explanation. Could you share which tool you are using for drawing?
Great presentation ! One question though, how would you determine in advance which links are considered "hot" ?
subscribed👍
Great video. Couple of questions-
1. You said there shouldn't be more than 1 short urls for a single long url. So do you check in the database if a long url already has a short url or does the "hash" function takes care of it?
2. Do you just return the same short url for a different person if he has requested for long url already created by a previous person? If yes then why do you store the userId in the table
yes! great questions. 1. In theory the hashing technique (or the random keygen + dedicated db technique) should always create unique hashes so there should be no collisions. But it's cheap to just double check the DB and make sure no shortURL already exists so there are no duplicates/collisions. So we can just go ahead and do that as well.
2. That's a great point, I honestly hadn't really considered. I think for user experience you would want to give person 2 a fresh unique tinyURL (especially if they are requesting a custom tiny url). So there would be 2 different entries in the database where the keys are different tiny hashes, but the values both include the same long URL. so to your point, yeah the userID might not be necessary
@@codetour Thanks for the reply. Looking forward to more videos on system design questions. Cheers 🍻
👌👌
waiting new Systems designs ❤❤
Hands down top 2 system design vid on TinyURL on this site.
Hi there, I really loved the way you explain, excellent video!
Do you mind to share the tools you used to record this video and create this presentation?
great video, that demonstrate the importance of thinking a bit in advance, before start coding. Eventually we end up with a cache lookup system.
I have some questions...
1) Do you consider validating the URLs? Is there a limitation? What if someone would basically start to use this as a free cache store...
2) Are these tiny URLs are public or do you need the access keys to get to the real one?
3) I am wondering if you could possibly use a simple counter and hash that, instead of the whole URL. That would be faster and the hash would have a great distribution as well.
4) If you have the same hash for the same URL, it would be hard to delete the entry later, since other client has the reference. However, that could be a "prime" feature
what's the name of the whiteboard
seems pretty cool
Photoshop
Great video, you have a subscriber! Had a couple of questions about the shortening approaches:
1. On the key-generation approach: What's the rationale behind pre-generating keys? Are you trying to avoid a uniqueness check at creation time? Would the UNIQUE index on an SQL table be too big/slow?
2. On the hashing approach: Does the hash function guarantee equal distribution amongst the buckets? Not sure if picking the first letter out of the hash guarantees that. If not, perhaps re-hashing the hash with a function that guarantees a uniformly random output might do the trick. All this to say that skewed shards might be a big problem.
Hi
In the article with base64 conversion there is a problem to address.
We have a counter that is shared across servers, so that counter is a critical section, and a race condition might arise as different servers can read the same value and generate the same short URL.
We must ensure mutual exclusion in that case.
Correct me if i am wrong :)
Very useful! 🙌👨💻
in 5:07, I don't think that the API key should be needed for the redirection endpoint
Subscribed 😅😅
Thank you
Great video. Principal Engineer here, learned a thing or two and you touched on just enough. I liked the stats, however depending on how popular this service is these things would change a bit. An idea for more content would be to break these up into something like a hobby project, medium size (whatever that means), and enterprise level (what you displayed here).
Regardless loved it!
where does the 70GB cache storage come from? Given 60TB total storage, wouldn't 20% of 60TB be around 12 TB?
Is there any reason for the assumption of 200:1 read: create ratio? If, then please explain.
No particular reason for those specific numbers. This is a napkin calculation so the idea is to say “here we’ll get way more reads than writes”. for this exercise it wouldn’t make much difference if the ratio was 400:1 or 1000:50 etc.
Super mindful video, you better not get lost in any section of the video or you'll end up like, is this gibberish? 😂
Awesome explanation 🎉
"slightly inspired" by Sandeep's article 🤔
So, you just make up those numbers and define your own requirements and throw a bunch of servers at these things.
I don't understand why people take system design interviews. That just seems to me like common sense to load balance your requirements. Any software dev should know them.
what's with the holding your mic trend 😂
Using a microphone improves the quality of your audio much more than for example just recording straight into your phone/camera
@@codetour That's not what I meant. I get the point of using a good mic. But why not just put it on your shirt instead of holding it in your hand 😉
@@blizzy78 the sound quality is actually slightly better when the mic head doesn't rub against fabric
can people really do all this math in their head that quick?
How exactly will I earn money with that scale? 😮
Service should also provide some statistics, so users know if they draw any traffic and when. Add these to your calculations - data + load
You said 100 years - where is expiry date? How often do you check when to delete?
100 years idea is just stupid overkill, stretching whole budget, give few years or check traffic to each link and if something seems not used then delete it.
You're a surface-level scammer.
Tell me more
@@codetour I'll be honest with you. I've been shadowed-banned many times, so I just assume all of the comments I post will never be seen.