I'm glad you uploaded a video about FastAPI. We prefer it over Flask.
There are 2 topics where we need some help.
1.) Hosting - How to deploy the app so that others can access it via the web? And how to manage the cloud infrastructure?
2.) Frontend - There are now plenty of frameworks and libraries. The standard approach is probably JavaScript, HTML and CSS. But I'm wondering what you think about pure Python libraries like Taipy, FastUI and reflex. What do you think is the best approach here? We would highly appreciate your input. Thanks!
Keep up the great work! 💪💪👍👍
This was excellent, the capabilities this opens up are really powerful. Good job as always.
We use FastAPI more than Django and Flask. Can you please create a video on LangChain and FastAPI as well?
+1
In production, the async endpoint should not be used this way. An async function (coroutine) runs on the event loop in the main thread, and like the event loop in browser JavaScript, that loop can only execute one coroutine at a time. Running the synchronous, CPU-intensive `model.predict` inside the async endpoint blocks the loop until the CPU finishes predicting the image, so the handler effectively serves only one request at a time.
Better options could be: 1) use a synchronous function as the inference endpoint, 2) create a `ThreadPoolExecutor` outside of the async function and call `loop.run_in_executor()` with it, so that the model runs in a worker thread, or 3) use a `ProcessPoolExecutor` in the same way as option 2. The problem with option 3 is that multiprocessing requires pickling, and you might have to tweak your model case by case.
Also, pickling the model and deserializing it in the application's API server doesn't reveal the identity and method signatures of that model. If you are the only one who trains and deploys it, that might not be a big problem, but in production you might want to use an inference framework such as ONNX Runtime, where you first serialize your trained model to the framework's format (onnxruntime has a very small package size compared to other DL libraries, which keeps the deployment dependencies small). Lastly, running a scikit-learn model in Python doesn't use all the cores in your CPU, whereas other packages usually achieve higher utilization.
I understand that the model in this video is small and is a POC, so at this size running it inside an async endpoint and pickling it is fine. However, for even slightly more capable CV and NLP models (e.g. BERT), it is nearly impossible to adopt the same approach as in this tutorial.
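For anyone who wants to see the difference, here is a minimal sketch of option 2 using only the standard library. `blocking_predict` is a hypothetical stand-in for `model.predict` (it just sleeps), and the handler names are made up for illustration; this is not the code from the video. The sleep releases the GIL, so the overlap shown here is the best case. Real inference only overlaps to the extent the underlying library releases the GIL, but either way the event loop stays free to accept other requests.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_predict(x: int) -> int:
    """Hypothetical stand-in for a synchronous model.predict call."""
    time.sleep(0.1)  # simulates the time spent on inference
    return x * 2


# Created once, outside the coroutine, and shared by all requests.
executor = ThreadPoolExecutor(max_workers=4)


async def predict_blocking(x: int) -> int:
    # Calls the blocking function directly, so the event loop stalls
    # and no other coroutine can run until it returns.
    return blocking_predict(x)


async def predict_offloaded(x: int) -> int:
    # Hands the blocking call to a worker thread and awaits the result,
    # leaving the event loop free in the meantime.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, blocking_predict, x)


async def timed(handler, n: int = 4):
    # Fires n "requests" concurrently and measures the total wall time.
    start = time.perf_counter()
    results = await asyncio.gather(*(handler(i) for i in range(n)))
    return results, time.perf_counter() - start


direct_results, direct_time = asyncio.run(timed(predict_blocking))
pool_results, pool_time = asyncio.run(timed(predict_offloaded))
print(f"direct: {direct_time:.2f}s, thread pool: {pool_time:.2f}s")
```

Both handlers return the same results, but the direct version processes the four calls one after another while the offloaded version overlaps them. For option 1, FastAPI already runs plain `def` endpoints in a thread pool, so declaring the endpoint without `async` gets a similar effect with no extra code.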
@cheukmingau983 Hello, I'm currently facing a problem very similar to what you described. Is there somewhere I can message you to get more info on this?
@alexandrosmaragkakis737 Perhaps here? You can describe your situation with just the minimal details.
You are making requested videos. Thank you 💯
Exactly what I was looking for! Thanks man!
I've been using FastAPI for a couple of years now. Just starting with AI models. I had planned on calling models with FastAPI. I'll see if I can do that with a ViT model I've been working with.
Can we see a hosting video for the same project?
Why not use the model directly instead of pickle?
Pickle saves the model's fitted weights, so you only need to fit once.
@tthcan8038 Technically, pickle serializes a Python object (e.g., a scikit-learn model in this case) to disk, and lets you load that binary data straight back into a Python object when reading.
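A minimal sketch of that round trip, using only the standard library. The dict below is a hypothetical stand-in for a fitted model (in the video it would be the trained scikit-learn estimator), and the file name is just an example.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: a real estimator object
# would be dumped and loaded in exactly the same way.
fitted_model = {"weights": [0.5, -1.25, 3.0], "bias": 0.1}

path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Serialize the object to disk once, right after training.
with open(path, "wb") as f:
    pickle.dump(fitted_model, f)

# Later (e.g. when the API server starts), load it back without refitting.
with open(path, "rb") as f:
    restored_model = pickle.load(f)

print(restored_model == fitted_model)
```

The restored object is a separate copy with the same contents, which is why the server does not need to fit again. Only unpickle files you trust, since loading a pickle can execute arbitrary code.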
Great video! I'm a web developer and new to ML. Do you have the source code for this project in a Github repo? I would really love to try this out locally.
For me it correctly guesses only numbers 4, 6. For the rest it says they're 7 or 5.
Why do I have this error: 'module 'PIL.Image' has no attribute 'ANTIALIAS''? @10:41
ANTIALIAS was removed in Pillow 10.0.0 (after being deprecated through many previous versions). Now you need to use PIL.Image.LANCZOS or PIL.Image.Resampling.LANCZOS.
(This is the exact same algorithm that ANTIALIAS referred to, you just can no longer access it through the name ANTIALIAS.)
@bobfreeman7349 That worked, thank you. This is what I used:
pil_image = pil_image.resize((28, 28), PIL.Image.LANCZOS)
I like this guy
Please ensure that your Discord server remains joinable. Thanks!
What is your daily Linux distro? ❤
Pop!_OS
Arch Linux with Hyprland
He is using Linux Mint. I am using Ubuntu.
I am not entirely sure I understand how that works, but thanks a lot for the video.