If you want to learn RAG Beyond Basics, checkout this course: prompt-s-site.thinkific.com/courses/rag
Thank you, keep it coming chief, great work!
These rag videos are super interesting
thanks
Hey, this is amazing! I kindly request you to upload some videos on how we can work with PDF document extraction for text, tables, images, graphs, etc. in documents for a RAG application.
Such great code explanation and layout... so many Gist-able functions... thanks!!
Thanks for the great video. Is it possible to take both an input image and text from the user and query with them? For example, the user uploads an image of their car and asks about similar cars with the lowest price based on the uploaded image. The system then retrieves related car images and text from the database.
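For anyone wondering how that could work in practice (this isn't from the video): one common approach is to embed both the image and the text into a shared space with CLIP and combine the vectors before searching a vector DB. A minimal sketch, where the model name is real but the indexing/search step is left out and the file name and query are placeholders:

```python
# Minimal sketch: combined image+text query embedding with CLIP.
# Assumes `pip install transformers torch pillow`; "my_car.jpg" is a placeholder.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_image(path: str) -> torch.Tensor:
    """Embed an uploaded image into CLIP's shared image/text space."""
    inputs = processor(images=Image.open(path), return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)  # normalize for cosine search

def embed_text(query: str) -> torch.Tensor:
    """Embed the user's text query into the same space."""
    inputs = processor(text=[query], return_tensors="pt", padding=True)
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

# Average the two signals, then search your vector DB of car images and
# descriptions with this vector (the indexing step is not shown here).
query_vec = (embed_image("my_car.jpg") + embed_text("similar cars, lowest price")) / 2
```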
Is there a cost to using the API keys here? Wondering if this can be built into an application at scale.
Hi, I had a small doubt. Don't LangChain's document loaders extract images from the document?
No, by default, they do not. You can use something like Unstructured.io, which can extract images and tables. Will create a video on it soon.
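For reference, a rough sketch of what that looks like with unstructured. Treat the argument names as assumptions: flags like extract_images_in_pdf have changed between releases.

```python
# Sketch: pulling text, tables, and images out of a PDF with unstructured.
# Assumes `pip install "unstructured[pdf]"`; "report.pdf" is a placeholder.
from unstructured.partition.pdf import partition_pdf

elements = partition_pdf(
    filename="report.pdf",
    strategy="hi_res",            # layout-aware parsing, needed for tables/images
    infer_table_structure=True,   # keep table structure as HTML in metadata
    extract_images_in_pdf=True,   # save embedded images to disk (flag varies by version)
)

for el in elements:
    if el.category == "Table":
        print(el.metadata.text_as_html)   # table preserved as HTML
    else:
        print(el.category, el.text[:80])  # text chunks, titles, etc.
```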
@@engineerprompt I have actually built a RAG chatbot using LangChain for my organisation. The PDFs that we load usually contain lots of tables and a few images. So far it is giving good responses from those PDFs. But yeah, if there is a method to extract this non-text data more efficiently, I'll definitely want to integrate it with my chatbot.
@@engineerprompt Any updates on this?
Can you use a PDF containing images instead of this text data and image data?
Hey, will this code not run on Windows, only in Colab?
Excellent tutorial!
Can you share the .ipynb, please?
I wonder how long before we will be able to run this locally, and what would be a good model then. So far, from my testing, nothing compares to GPT-4... Thanks for the video.
Claude 3.5 Sonnet is far more performant than any other model right now.
Local vision models still have a long way to go, but hopefully we will have something "good enough" soon.
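If you want to experiment locally anyway, here is a minimal sketch using a LLaVA model through Ollama. It assumes you've pulled the model with `ollama pull llava` and installed the ollama Python client; the image path is a placeholder, and quality won't match GPT-4V.

```python
# Minimal sketch: querying a local vision model through Ollama.
# Assumes `ollama pull llava` has been run and `pip install ollama`.
import ollama

response = ollama.chat(
    model="llava",
    messages=[{
        "role": "user",
        "content": "Describe this image for a RAG index.",
        "images": ["./extracted/figure_1.png"],  # path to a local image file
    }],
)
print(response["message"]["content"])
```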
This is a nice demo, but it's not very useful in real-world scenarios: you can maybe extract those images from a wiki, but you can't from a specific PDF file. It's still a nice demo, just not much use in real-world projects where you need to build a specific app. Still a good thing for someone who wants to learn, though.
Well, it's a start. It's a step toward extracting images from a PDF, Excel, LibreOffice, or CSV file.
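To be fair, pulling embedded images out of a specific PDF is doable, e.g. with PyMuPDF. A minimal sketch (file names are placeholders):

```python
# Sketch: extracting embedded images from a specific PDF with PyMuPDF.
# Assumes `pip install pymupdf`; "input.pdf" is a placeholder filename.
import fitz  # PyMuPDF

doc = fitz.open("input.pdf")
for page_num, page in enumerate(doc):
    for img_index, img in enumerate(page.get_images(full=True)):
        xref = img[0]                 # cross-reference number of the image object
        pix = fitz.Pixmap(doc, xref)
        if pix.n - pix.alpha >= 4:    # CMYK and similar: convert to RGB first
            pix = fitz.Pixmap(fitz.csRGB, pix)
        pix.save(f"page{page_num}_img{img_index}.png")
doc.close()
```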
Can you make the same thing using free API models, since the GPT API isn't free? A guide to hosting it on a cloud would also be great: an end-to-end app deployed on the cloud.
Why is it exactly 10x better?! Maybe it's just better?
Can we get the code?
Link is in the video description.