Your video really helps me a lot. Thanks!
Wow! Thank you so much for the Super Thanks :D
Hats off to you for making such an amazing and complete tutorial. Next video request: 1. Guardrails 2. Multi-tenancy 3. JWT Auth implementations.
God Bless you man! Thank you for your work!
Great video. It is very complete! I only have one question: why didn't you use cloud hosting for the vector database (like Pinecone, for example)? I mean, in a production environment it's more efficient, isn't it? Thank you for your content! Keep it going :)
Great video, really helpful. Could you please create a video that covers the other 3 features you suggested, i.e. adding a Web UI (e.g. React & Next.js), authentication (e.g. Clerk), and payments (e.g. Stripe)? Thanks.
basically create a business with this? lol
that's quite a different topic mate, he's a data scientist, not the entire IT department... but you can easily test this same app with a basic HTML, CSS and JS webpage calling the API endpoints.
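To illustrate that idea with something self-contained: below is a sketch of calling a RAG API's endpoint the same way a simple webpage's fetch() would. The `/query` endpoint, its request/response shape, and the stub server are all hypothetical, just so the snippet runs without deploying anything.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request

class StubRagApi(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the tutorial's deployed API: a /query
    endpoint that echoes the question, so the client code is runnable."""

    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        payload = json.dumps({"answer": f"You asked: {body['question']}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence per-request logging
        pass

def ask(base_url: str, question: str) -> str:
    """POST a question to the API, the same shape a JS fetch() would send."""
    req = request.Request(
        f"{base_url}/query",
        data=json.dumps({"question": question}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["answer"]

# Start the stub on a random free port and query it like a client would.
server = HTTPServer(("127.0.0.1", 0), StubRagApi)
threading.Thread(target=server.serve_forever, daemon=True).start()
BASE_URL = f"http://127.0.0.1:{server.server_port}"
print(ask(BASE_URL, "What is RAG?"))
```

Swap `BASE_URL` for the real API Gateway URL and the same `ask()` pattern works against the deployed app.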
That's basically a basic webdev tutorial lol, has nothing to do with data analysis or ML/LLMs
Thanks for the feedback! Glad you found the video helpful. That's a great suggestion for future content. I actually do have some (outdated) videos on all of those on my channel already, but it's a great suggestion to integrate them all with a Python AI/RAG app for sure.
This tutorial is great. Thank you Bro for that.
Great video, that was exactly what I needed. It just worked. Thank you so much.
Glad it worked out for you! Thanks for watching 😊
Great content man! Really appreciate it.
Thanks! Really glad you found it helpful!
This is an amazing tutorial, thank you very much for this kind of content!
Instant like and sub
This is a great video and I love all your tutorials. You should call out which models you are using. It took me a long time to figure out which model was used for the embeddings since you did not pass the Model ID as a parameter there.
Thanks for the kind words! You're absolutely right about calling out the models - that's a great suggestion. For the embeddings, I used the default model in the AWS Bedrock SDK. I'll make sure to explicitly mention model IDs in future tutorials. Appreciate you pointing that out!
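For reference, pinning the embedding model explicitly looks roughly like this. This is a sketch, not the video's exact code: at the time of writing, LangChain's `BedrockEmbeddings` defaulted to the Titan v1 embedding model, and the v2 ID shown is just an example of the Bedrock naming scheme, so check the Bedrock console for your region.

```python
from langchain_aws import BedrockEmbeddings

# Default behavior (what the video relied on): no model_id passed,
# which falls back to "amazon.titan-embed-text-v1".
default_embeddings = BedrockEmbeddings()

# Explicit version: pass the model ID so readers can see exactly
# which model produces the embeddings. Example ID, verify per region.
pinned_embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")
```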
It is really useful. Thanks a lot.
Awesome video bro! Why not use AWS Kendra for the RAG block? You would've saved at least half of the video and the infra issues you faced. Still good for learning, nice job!
Kendra is only free for the 1st month, and it costs $1+ per hour after that. You could totally run an EC2 instance on that budget.
Meanwhile, Lambda has a permanent free tier of 1 million requests/month. Perfect for a quick API-calling task.
API Gateway is also free for the first year.
If you've got 5,000+ users/month and the usage goes way above the Lambda free tier, then EC2 is more optimal. But honestly, I'd say just buy a physical server at that point.
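To put rough numbers on that crossover argument, here is a back-of-the-envelope sketch. All prices are illustrative assumptions (ballpark us-east-1 figures, ignoring Lambda's compute free tier and API Gateway costs), not current AWS pricing, so check the AWS calculator before deciding anything.

```python
# Assumed prices -- illustrative only, not current AWS pricing.
LAMBDA_PER_MILLION_REQ = 0.20      # $ per 1M requests beyond the free tier
LAMBDA_GB_SECOND = 0.0000166667    # $ per GB-second of compute
EC2_MONTHLY = 15.0                 # $ for a small always-on instance

def lambda_monthly_cost(requests: int, mem_gb: float = 0.125,
                        avg_seconds: float = 0.5,
                        free_requests: int = 1_000_000) -> float:
    """Rough monthly Lambda bill for a quick API task.

    Simplified: ignores the 400k GB-s compute free tier, so this
    slightly overestimates the Lambda side.
    """
    billable = max(requests - free_requests, 0)
    request_cost = billable / 1_000_000 * LAMBDA_PER_MILLION_REQ
    compute_cost = billable * mem_gb * avg_seconds * LAMBDA_GB_SECOND
    return request_cost + compute_cost

# Lambda stays cheap well past the free tier; the fixed-price
# instance only wins at much higher volume.
for reqs in (1_000_000, 5_000_000, 50_000_000):
    cost = lambda_monthly_cost(reqs)
    winner = "Lambda" if cost < EC2_MONTHLY else "EC2"
    print(f"{reqs:>11,} req/mo -> Lambda ~${cost:.2f} vs EC2 ${EC2_MONTHLY:.2f} ({winner})")
```

Under these assumptions the crossover lands somewhere in the tens of millions of requests per month, which matches the "way above the Lambda free tier" framing above.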
that's pure gold, ty!
Amazing video!
Thanks! Glad you enjoyed it! 😊
awesome, more videos about AI, please🙂
Thanks a lot! This is awesome!
Thanks! Glad you found it awesome 😊
I love your videos! Which software do you use for your architecture diagrams? They look wonderful.
Thanks! I use Excalidraw: excalidraw.com/
This was great! Thank you.
If we use Llama 3 as the model, how can we install and use it in production? If I remember correctly, it needs really big hardware, so the price must be insanely expensive.
It would be great if you could provide more details on the rag-sdk-infra step. It's hard to understand what it is and what happens behind the scenes.
Yeah, the rag-sdk-infra step can be a bit tricky to grasp at first. It's using AWS CDK, which is a pretty big topic on its own, and out of scope for this tutorial. I'll probably make more CDK tutorials in the future to cover it in more detail.
can you make a tutorial deploying it to Streamlit?
Would this be beneficial to integrate into a chat based app?
Why don't you use a WebSocket for communication with the ChatBot?
What happens in the database if the info on the pdf was wrong? Is it easy to correct? Is it easy to delete the wrong info and populate with the new one?
Great question! The video doesn't cover this specific scenario, but here's the gist:
1. Wrong info: It stays in the database until corrected.
2. Correction: Usually pretty straightforward. You'd update the vector database entries.
3. Deletion and repopulation: Yep, totally doable. You'd remove the old embeddings and add new ones.
The challenge will mostly be around figuring out which "chunk" has changed in the database (so you can update it).
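A minimal sketch of that chunk-diff idea: derive each chunk's ID deterministically from its source and a content hash, so a changed chunk gets a new ID and unchanged chunks keep theirs. The function names here are hypothetical, and the actual delete/add calls would go to whatever vector store you use.

```python
import hashlib

def chunk_id(source: str, text: str) -> str:
    """Deterministic ID: same source + same text -> same ID."""
    digest = hashlib.sha256(f"{source}:{text}".encode()).hexdigest()[:16]
    return f"{source}:{digest}"

def diff_chunks(old_chunks: dict[str, str], new_texts: list[str],
                source: str) -> tuple[set[str], dict[str, str]]:
    """Compare stored chunks against a re-parsed PDF.

    Returns (ids_to_delete, chunks_to_add). Unchanged chunks keep
    the same ID, so they appear in neither set.
    """
    new_chunks = {chunk_id(source, t): t for t in new_texts}
    to_delete = set(old_chunks) - set(new_chunks)
    to_add = {cid: t for cid, t in new_chunks.items() if cid not in old_chunks}
    return to_delete, to_add

# Example: one chunk of the PDF was corrected (10M -> 12M).
old = {chunk_id("report.pdf", t): t
       for t in ["revenue was 10M", "HQ is in Paris"]}
delete_ids, add_chunks = diff_chunks(
    old, ["revenue was 12M", "HQ is in Paris"], "report.pdf")
# Only the changed chunk needs deleting and re-embedding, e.g.:
#   db.delete(ids=list(delete_ids))
#   db.add_texts(add_chunks.values(), ids=list(add_chunks))
```

This keeps re-embedding costs down, since only the corrected chunk is touched on an update.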
How about using Cloudflare?
Yeah, Cloudflare could be another great option, but I haven't explored it much myself so I can't really say much about it.
Great video! Why didn't you use the Azure OpenAI LLM? Most companies use closed OpenAI models through Azure, so we could learn the Azure OpenAI services. Please create a video building a RAG application with the Azure OpenAI LLM and multimodal support (input multiple PDFs with images, tables, and text) and integrate it with Streamlit.
Post the video using an Azure OpenAI API key,
Azure OpenAI embeddings, and AzureChatOpenAI for a multiple-PDF RAG application.
There are no videos on YouTube using the Azure OpenAI endpoint, so it would be helpful 😊
Great content. 🫡
Just finished building this. How can I use this model for Excel files, where the whole file needs to be checked to provide answers? It currently just checks the relevant chunks and doesn't summarize the whole file for .xlsx files.
Can you take this up in your next tutorial?
Love your content!