Rate Limiter Design Deep Dive with Google SWE! | Systems Design Interview Question 7

  • Published Nov 4, 2024

Comments • 29

  • @pl5778
    @pl5778 1 year ago +4

    This video is awesome. I think there are a lot of little gold nuggets in how you explain things, for example when you talked about the single/multi-threading of Redis.

  • @ouvreboite87
    @ouvreboite87 9 months ago +2

    Nice video :)
    To extend a bit (if someone is looking to implement an actual distributed rate limiter):
    The "list of events" approach for the rolling window is not optimal: memory and CPU usage grow as you get closer to the limit (more events to store and iterate over in the list).
    The "leaky bucket" (or the similar "token bucket") algorithm would be a better solution: constant time, constant memory, and bursting built-in (see the sketch after this exchange).
    It also requires some sort of "transactional" support.

    • @jordanhasnolife5163
      @jordanhasnolife5163  9 months ago

      Thanks for that, perhaps I should have covered leaky bucket! It could definitely be useful to reduce the space overhead here, and maybe it would allow us to do more caching on the load balancer.
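
      To make the constant-time, constant-memory point concrete, here is a minimal token-bucket sketch in Python (purely illustrative; a real distributed version would keep this state in Redis behind a transaction, as the comment above notes):

      import time

      class TokenBucket:
          """Refill tokens at a fixed rate; spend one per request. O(1) time and memory."""
          def __init__(self, capacity: int, refill_per_sec: float):
              self.capacity = capacity          # max burst size
              self.tokens = float(capacity)     # start full
              self.refill_per_sec = refill_per_sec
              self.last = time.monotonic()

          def allow(self) -> bool:
              now = time.monotonic()
              # Refill proportionally to elapsed time, capped at capacity.
              self.tokens = min(self.capacity,
                                self.tokens + (now - self.last) * self.refill_per_sec)
              self.last = now
              if self.tokens >= 1.0:
                  self.tokens -= 1.0
                  return True
              return False

      # e.g. allow bursts of 10 and a sustained 5 requests/second:
      bucket = TokenBucket(capacity=10, refill_per_sec=5.0)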

  • @kaushikbv
    @kaushikbv 2 years ago +4

    Was asked this exact question in a coding interview.
    I was able to come up with the exact rolling-window approach that you mentioned.
    Wrote the pseudocode for the same.
    But the interviewer wanted working code for this in the 45-minute interview.
    I was a bit baffled by the expectation.

    • @jordanhasnolife5163
      @jordanhasnolife5163  2 years ago +2

      First of all, glad the video helped! You can't always pass every interview, as sometimes the expectations really are high, but sounds like you're getting closer! Just use the interview as a way to self reflect and figure out what you need to work on.

  • @6eeykwr2
    @6eeykwr2 2 years ago +4

    lol love the “rate limited” segments in the video lmao

  • @username911
    @username911 5 months ago +1

    AFAIK Redis supports keyspace notifications, and that includes expiry as well. That makes me think of the following algorithm for the sliding-window approach:
    - unless it already exists, create a consistent IP- and/or user-based hash => a Redis set with an expiry of the allowed rate-limit duration
    - if the hash already exists, create a key "guid:random_hash" whose value is the consistent hash, with the rate-limit duration as its TTL, and add it to the set too
    - remove "guid:random_hash" from the set when the expire event fires for that key
    - when a request comes in, check in O(1) whether the set has fewer than rate-1 items; if not, reject, else carry on with the above implementation.
    What do you think?

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago +1

      Seems reasonable; I imagine that for TTLs Redis is basically doing the process we've described.
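
      A rough sketch of the expiry-notification piece with redis-py (assuming expired-key events are enabled; the key layout is illustrative, and note the user hash has to be recoverable from the key name itself, since the value is already gone when the event fires):

      import redis

      r = redis.Redis()
      r.config_set("notify-keyspace-events", "Ex")  # expired-key events are off by default

      p = r.pubsub()
      p.psubscribe("__keyevent@0__:expired")        # expiry events for DB 0

      for msg in p.listen():
          if msg["type"] != "pmessage":
              continue
          expired_key = msg["data"].decode()        # name of the key that just expired
          if expired_key.startswith("guid:"):
              # assumed layout "guid:<user_hash>:<random>" -> prune it from that user's set
              user_hash = expired_key.split(":")[1]
              r.srem(f"requests:{user_hash}", expired_key)

      One caveat: Redis expires keys somewhat lazily, so these events can arrive late, which slightly inflates the counted window.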

  • @karenronen6618
    @karenronen6618 1 year ago +1

    Thanks for making this video :) I was wondering about the write-back cache bit: isn't Redis already a cache? It's an in-memory key-value store, so I'm trying to understand the benefit of using a write-back cache if we already have a low-latency cache...

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +1

      That's correct, but you still incur a network call to use it, whereas caching the result on the actual server itself avoids that.
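
      For illustration, a tiny version of that local layer (everything here is hypothetical, including the fixed-window check standing in for the real Redis call):

      import time
      import redis

      r = redis.Redis()
      local_cache = {}  # user_id -> (is_limited, expires_at)

      def check_redis(user_id: str) -> bool:
          # Hypothetical stand-in for the real remote check.
          return r.incr(f"rl:{user_id}") > 100

      def is_rate_limited(user_id: str) -> bool:
          hit = local_cache.get(user_id)
          if hit and hit[1] > time.monotonic():
              return hit[0]                          # served in-process, no network call
          limited = check_redis(user_id)             # one network round trip
          # Cache the decision briefly; staleness is bounded by this TTL.
          local_cache[user_id] = (limited, time.monotonic() + 1.0)
          return limited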

  • @raj_kundalia
    @raj_kundalia 8 months ago +1

    thank you!

  • @mrdanp123
    @mrdanp123 2 years ago +1

    Another great video. I've been reading about this a bit, and Redis can give you atomicity with Lua scripts or MULTI/EXEC commands. Also, a great data structure for the rolling window is the Redis sorted set, where you use the timestamp as the score. Then you can perform an efficient range query to get all elements in your window.
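
    A sketch of that combination with redis-py (the window and limit are made-up numbers; the pipeline gives MULTI/EXEC semantics, and a Lua script would make the check-and-add fully atomic):

    import time
    import uuid
    import redis

    r = redis.Redis()
    WINDOW = 60    # seconds
    LIMIT = 100    # max requests per window

    def allow(user_id: str) -> bool:
        now = time.time()
        key = f"rl:{user_id}"
        member = f"{now}:{uuid.uuid4().hex}"          # unique even for same-instant requests
        pipe = r.pipeline()                           # MULTI ... EXEC
        pipe.zremrangebyscore(key, 0, now - WINDOW)   # evict events outside the window
        pipe.zadd(key, {member: now})                 # score = timestamp
        pipe.zcard(key)                               # count what's left in the window
        pipe.expire(key, WINDOW)                      # let idle users' keys expire
        _, _, count, _ = pipe.execute()
        return count <= LIMIT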

  • @stephenchan7944
    @stephenchan7944 1 year ago +1

    Hey Jordan,
    Do you have any advice for knowing what metrics to track for capacity estimates? I often find it challenging to pin down exactly what I'm calculating capacity estimates for, and at what granularity.
    As always, thanks for the great content!

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +1

      I still have difficulty with this too! I think the main thing is not to focus too much on a specific number as a result, but more so to get a sense of where the bottlenecks in your system lie, whether that's reads or writes or anything else. The estimates themselves can be very crude.
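
      As a purely made-up example of how crude these can be:

      10M daily users x 100 requests/day = 1B limiter checks/day ≈ 12K checks/sec on average
      8-byte timestamp x up to 100 window events/user ≈ 800 B of state per user
      10M users x 800 B ≈ 8 GB of window state (ignoring data-structure overhead)

      Even numbers this rough tell you where the pressure is: steady write traffic rather than storage.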

  • @aniruddhkhera510
    @aniruddhkhera510 2 years ago +1

    Hey Jordan, great video as always!
    I have a question on the write-back cache approach. Although the benefit is that it saves the network call to Redis, it complicates the system.
    How would you keep this local cache updated in a distributed rate-limiter setup? If the LB picks a rate-limiter instance based on how busy it is, then there are a lot of problems around consistency, and you'd probably need a mechanism to sync the local caches.
    The other way I could think of is using hashing, or better, consistent hashing, to solve the problem of hotspots, yet again we land on the same problem of keeping the local cache updated.
    What do you think?
    I would think of a gossip protocol to sync local caches, or setting a low threshold to expire the local record, probably around the rate-limit duration.

    • @jordanhasnolife5163
      @jordanhasnolife5163  2 years ago

      I think that using consistent hashing on something like a userID will make sure that all of a user's requests go to the same server, so that you can cache them there.

    • @sanjivsingh091
      @sanjivsingh091 9 months ago

      @@jordanhasnolife5163 are we suggesting the sticky-session thing here?
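
      In spirit it's similar to sticky sessions, but done with deterministic hashing rather than cookies, so the balancer needs no per-session state. A toy sketch of the consistent-hashing idea (node names are made up):

      import bisect
      import hashlib

      NODES = ["limiter-a", "limiter-b", "limiter-c"]   # hypothetical instances

      def _h(s: str) -> int:
          return int(hashlib.md5(s.encode()).hexdigest(), 16)

      # Hash ring: each node gets several virtual points for smoother balance.
      ring = sorted((_h(f"{node}#{i}"), node) for node in NODES for i in range(100))
      keys = [k for k, _ in ring]

      def node_for(user_id: str) -> str:
          i = bisect.bisect(keys, _h(user_id)) % len(ring)
          return ring[i][1]   # the same user always lands on the same instance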

  • @serhiisoboliev2385
    @serhiisoboliev2385 2 years ago +1

    Nice content, please keep going

  • @mickeyp1291
    @mickeyp1291 9 months ago +1

    Hey Jordan, quality video, but a sloppy backend this time.
    You got so into the rate limiter algo that you forgot the per-endpoint service implementation.
    It needs another cache/Redis so that when the user connects to the rate limiter, it can push the user's rate (or the default rate) into a hash, so the rate limiter knows what to do. It also needs a MySQL database for the CRUD of the rate limits, which you also forgot about.
    API routes:
    /rates/userid/<id>/service/<service>
    /rates/ip/<ip>/service/<service>
    - add:    POST {ms: <count> or day: <count>}
    - del:    DEL
    - update: PUT {ms: <count> or day: <count>}
    - get:    GET -> response status (200 (ok), 400 (bad format), 422 (invalid))
              info_message: { state: [specified | default],
                              type:  [in ms | per day],
                              count: long,
                              id:    [ip | userkey] }

    • @jordanhasnolife5163
      @jordanhasnolife5163  9 months ago

      I think the re-made version of this video (from like 2 weeks ago) was probably less shabby in this regard. Good point though, you can totally do some partitioning per service. I didn't touch too much on rate limits per user, but I do agree that if those were to exist, you would need another form of storage to initialize them.

  • @faceTheCandidates
    @faceTheCandidates 1 year ago +1

    Great content.
    I think in your performance estimate you make an underlying assumption that there is only one API endpoint you are guarding. In reality, you might want to at least mention that you are rate limiting across multiple services, or multiple APIs within a service.

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago

      Fair point! Though for the sake of the design all else should remain more or less the same (you just have to duplicate it lol)
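
      Concretely, the duplication can be as simple as namespacing the limiter key per service (all names here are hypothetical):

      LIMITS = {"search-api": 100, "upload-api": 10}   # hypothetical per-service limits

      def rl_key(service: str, user_id: str) -> str:
          # One logical limiter per (service, user) pair; each service keeps its own limit.
          return f"rl:{service}:{user_id}"

      # e.g. rl_key("search-api", "u123") -> "rl:search-api:u123", checked against LIMITS["search-api"]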

  • @stephenirving9846
    @stephenirving9846 1 year ago

    This feels like a weird service to make. I'm glad I watched this to go into it, but in my experience an API and load-balancer layer is going to add 30 ms at a minimum when it comes to API Gateway on AWS, and even up to 100 ms if your team uses Lambdas for your API layer.
    On our AWS account, we just made our own ElastiCache table and hit that directly, because ultra-low latency is really important for us.
    I totally get that if you are offering this service to people outside your AWS account, you would need to have a load balancer and an API layer.

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago

      Yep, I think you've accurately described why the write-back cache is so important

  • @unknowrabbit666
    @unknowrabbit666 9 months ago +1

    Just finished an interview about rate limiting... I am so f*cked, but next time I will get it right. Thanks a lot!

    • @jordanhasnolife5163
      @jordanhasnolife5163  9 months ago +1

      That's how it goes! No worries, you gotta fail a few times to know where to improve!

  • @francisliubin
    @francisliubin 1 year ago +1

    So is a race condition a real problem here? If yes, is there any way to solve it besides a lock... Thank you.

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago

      You could always just hash things with the userID and timestamp included and hope for no collisions; otherwise you'll have to assume that key collisions are possible, which leads to write conflicts.
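
      One common lock-free pattern in that spirit: key by user ID plus the current window's timestamp bucket and lean on Redis's atomic INCR, so concurrent requests can't race (a sketch with redis-py; the numbers are made up):

      import time
      import redis

      r = redis.Redis()
      WINDOW = 60    # seconds
      LIMIT = 100    # max requests per window

      def allow(user_id: str) -> bool:
          bucket = int(time.time()) // WINDOW          # timestamp bucket, shared by all replicas
          key = f"rl:{user_id}:{bucket}"
          count = r.incr(key)                          # atomic read-modify-write, no lock
          if count == 1:
              r.expire(key, 2 * WINDOW)                # garbage-collect old buckets
          return count <= LIMIT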