✅ Get the FREE Software Architecture Checklist, a guide for building robust, scalable software systems: arjan.codes/checklist.
This is one of those details that separates hobby coding from enterprise level coding. 👏
Mission-critical rate limiting should happen outside your FastAPI app. Ideally you should use the rate limiting offered by your API Gateway, so your service doesn't go down if someone tries to DDoS your API. You should only use rate limiting in your app for the "good actors" who aren't caught by the API Gateway rate limiter but are going above their account-specific limit.
Yes. If you have an API gateway.
Maybe you are just running a service in your backyard
@@motbus3 well that's fair... but in that case you probably don't need rate limiting
I guess the more appropriate answer would be to have rate limiting at the API Gateway or Load Balancer level, since API Gateways aren't ubiquitous.
@@JohnWalz97 maybe you do need it. Maybe someone abuses your API or compromises your ability to interact with it.
Maybe you accidentally enable something that spawns hundreds of thousands of requests to your serverless API, and that API spends money somewhere else. Not that I did that 😶🌫️🙈🙉🙊
@@motbus3 lmao... well I've also done that before, so you're not alone 😭
Hi. I really love your videos. I have several years of work experience in different Python projects and still find your videos interesting and educational. But I have to be honest, this is the first video I have to criticise. I think when talking about a rate limiter, showing an implementation that uses Redis would be much better, as that is the real-world example you'll actually use. The moment you're running several services in the cloud with auto-scaling, you need one central point where you can store and validate the rate limits of your API. And knowing how to correctly set limits for different roles or access levels, and how to work with Redis and Lua, is definitely a great thing to know.
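For illustration, here is a minimal fixed-window sketch of that idea in pure Python. The in-memory dict stands in for Redis, where the same two steps would typically be an INCR plus EXPIRE wrapped in a Lua script so they run atomically across instances; the key format and limits here are made up:

```python
import time

# In-memory stand-in for Redis: key -> (window_start, count).
# In a real multi-instance deployment these two steps would be a
# Redis INCR plus EXPIRE, wrapped in a Lua script for atomicity.
_counters: dict[str, tuple[float, int]] = {}

def allow_request(key: str, limit: int = 5, window: float = 60.0) -> bool:
    """Fixed-window check: at most `limit` requests per `window` seconds."""
    now = time.monotonic()
    start, count = _counters.get(key, (now, 0))
    if now - start >= window:  # window expired: reset the counter
        start, count = now, 0
    if count >= limit:
        return False
    _counters[key] = (start, count + 1)
    return True
```

With keys like `"user:42"` (or including a role or access level in the key), you can look up a different `limit` per role before calling the check.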
Your FastAPI tutorials are always straightforward and well explained. Please make a tutorial on how to handle file uploads and work with PDFs in FastAPI. I would love to get your guide on this.
Great video. I actually implemented my own rate limiter, heavily based on suggestions from Copilot ;-) and I probably should go back and replace it with slowapi. I have a few calls which I need to protect (generating BTC Lightning invoices) that really must not get called many times, and my limits do the job.
It's very important to remember that most deployments of FastAPI (say, in Docker containers) run multiple workers, so you absolutely must have them sync up their rate-limit info via Redis or something like that.
It's interesting sometimes to see how the other side lives. I work exclusively on in-house projects so any API I create is consumed by internal customers and there's not been a hard need for rate limiting. I can see how it's a valuable thing for APIs accessible via public IP addresses and I can also see how it could be handy to prevent flooding.
So useful. Thanks!
Glad it was helpful!
4:55 NetworkChuck does it best (the writing on screen): essentially, make yourself smaller and write in the remaining space. Since the background is not an even color, choosing a text color that fits everywhere with sufficient contrast is difficult. The solution is to outline the drawing with a contrasting color, like a black outline for white text (that's what Windows does when putting text on screen over an arbitrary desktop background).
Hi Arjan,
I often rewatch your Design Patterns video to help improve my workflow. I'm a data engineer, and now I'm working on building a frontend for my ML projects. I used to use Streamlit, but recently I built a frontend (SvelteKit and Tailwind CSS) using the Claude Sonnet model with the Cursor IDE (Composer). Could you share some advice on how to structure our workflow to stay true to design patterns in this age of AI-assisted coding?
Thank you!
Rate limiting is something most people don't think of. I didn't think of it for years, until a project I was working on was bombarded and suddenly we saw 500s everywhere (which was a blessing in disguise, because we found a leak that we never realized was there). Anyway, since 2018 I always add at least some limiting to every project, and most of the time it's as easy as adding a decorator or a middleware. Or even better, outside of the main application(s), like at the gateway level, if you use that kind of solution.
Weird that frameworks don't add it by themselves as an option.
Slowapi is a good solution for that
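That decorator approach is roughly what slowapi gives you out of the box. As a sketch of the pattern, a per-process sliding-window version could look like this (the names `rate_limit` and `RateLimitExceeded` are illustrative, not slowapi's actual API):

```python
import functools
import time

class RateLimitExceeded(Exception):
    pass

def rate_limit(max_calls: int, per_seconds: float):
    """Decorator enforcing a sliding-window limit on recent call timestamps."""
    def decorator(func):
        calls: list[float] = []  # timestamps of recent calls (per process!)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            # drop timestamps that have fallen out of the window
            calls[:] = [t for t in calls if now - t < per_seconds]
            if len(calls) >= max_calls:
                raise RateLimitExceeded(
                    f"{func.__name__}: {max_calls} calls/{per_seconds}s exceeded"
                )
            calls.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limit(max_calls=2, per_seconds=60)
def create_invoice() -> str:
    return "invoice-created"
```

As the comment above notes, each worker process keeps its own `calls` list, so in a multi-worker deployment that state would need to move into shared storage like Redis.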
Shouldn't a missing API key be a 401 response?
Have you tried the fastapi-limiter package? How would you say it compares to slowapi?
I have tried that, but it seems that package needs Redis in order to work and I wanted something a bit more flexible.
I've just bumped a rate limiter (slowapi) onto one of our production instances.
I recommend it: very easy to implement, and Redis is optional (a must in multi-instance environments) but easy to hook up and configure.
How can you detect IP spoofing?
Why not fail2ban?
I'm wondering if the first basic solutions (IP, API key) work in Flask. Flask doesn't use asynchronous code but handles requests concurrently, so doesn't it create the usage: dict[str, list[float]] = {} anew for each request?
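To the question above: a module-level dict is created once per process at import time, not once per request, so within one Flask worker process every request (threaded or not) sees the same object. A small sketch:

```python
# Created once, when the module is imported; all requests handled by
# this process share it. Concurrent threads mutating it would need a
# lock, and separate worker processes each get their own copy, which
# is exactly why multi-instance setups move this state into Redis.
usage: dict[str, list[float]] = {}

def handle_request(ip: str, timestamp: float) -> int:
    """Stand-in for a request handler that records a hit for `ip`."""
    usage.setdefault(ip, []).append(timestamp)
    return len(usage[ip])
```

A second call for the same IP sees the entry left by the first, showing the state survives across "requests" within the process.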
It is better to solve SIMPLE rate limiting in your proxy. A proxy is specialized for network tasks and blazing fast, for example Nginx. (Note: I'm at the beginning of the video, so maybe this comes up later.)
edit:
COMPLEX rate limiting (additional logic, like checking per-user-account rates) does have to be solved in the application logic layer, with access to storage.
With that said, do not forget to set generous SIMPLE rate limits on your proxy (usually keyed on client IP).
Note: I am biased because we host our own proxy in production, so we are able to set up our own rules.
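As a sketch of that simple proxy-level limiting, Nginx's limit_req module keyed on client IP could look like this (the zone name, rate, and upstream address are illustrative values):

```nginx
# http block: 10 MB shared zone keyed on client IP, 10 requests/second
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    location /api/ {
        # allow short bursts of 20 extra requests, reject the rest with 429
        limit_req zone=api_limit burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://127.0.0.1:8000;
    }
}
```

The burst parameter lets legitimate spikes through while sustained floods get rejected before they ever reach the application.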