Bro, I literally came back to get your old video on PDFs and you already have an update. Thank you!
Idk, I just finally found the most understandable AI explanation content. Thank you Alejandro
glad to hear this :)
@alejandro_ao I want to create a multi-LLM chatbot for telecommunications. Is there a way to connect with you apart from YouTube so that I can share the problem statement with you?
the best touch is when you add a front-end
good job
hey! i'll add a ui for this in an upcoming tutorial 🤓
I was about to learn from the previous video, but you, brother, just bring more gold.
you’re the best
Excellent!!!! Thank you Alejandro
Amazing tutorial
Hey dude, what are you using to screen record? The mouse sizing and movement look super smooth. I'd like to create a similar style when giving tutorials.
hey there, that's the Screen Studio app for Mac, developed by the awesome Adam Pietrasiak @pie6k. check it out :)
thanks for the great content! how can we modify this to use a local LLM, like Llama 3.2 and Llama 3.2 Vision via Ollama?
You should look into LlamaParse rather than Unstructured. The amount of content I've indexed into the vector DB would have taken 15 days with Unstructured vs. only a few hours with LlamaParse. Plus, you can make the API calls async as well.
i LOVE llamaparse. i'll make a video about it this month
Do you know if Unstructured is open source (meaning free)? Do you know of any other free alternatives to Unstructured?
@daniellopez8078 Unstructured is free, but it's slow. It's the default with LangChain. LlamaParse offers a free plan that gives you 1,000 free pages to parse daily.
@daniellopez8078 Unstructured is free. They have an open-source version and a proprietary version. The proprietary version is paid and apparently offers better quality. The free Unstructured is slow. LlamaParse is fast, and it gives you 1,000 pages free per day.
Thank you for the video. Just curious: how do you go about persisting the multi-vector database? What data stores are available that cater to such requirements? Also, how do we go about getting an image as input from the user, so the language model can relate it to the documents and predict an answer?
Good one!! Did you come across any open-source alternatives, like Markers?
Great tutorial, very detailed. Just one question: is there any option to link the text chunk that describes the image as context for the image, to create a more accurate summary of the image?
beautiful question. totally. as you can see, the image is actually one of the `orig_elements` inside a `CompositeElement`. and the `CompositeElement` object has a property called `text`, which contains the raw text of the entire chunk. this means that instead of just extracting the image alone like i did here, you can extract the image alongside the text in its parent `CompositeElement` and send that along with the image when generating the summary. great idea 💪
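something like this rough sketch could do it (assuming the chunks come from `partition_pdf` with `extract_image_block_to_payload=True`, so images carry a base64 payload in their metadata):

```python
# rough sketch: pair each extracted image with the raw text of its parent
# CompositeElement so the summary prompt sees both.
# assumes `chunks` is the list of elements returned by partition_pdf with
# chunking_strategy="by_title" and extract_image_block_to_payload=True.
images_with_context = []
for chunk in chunks:
    if "CompositeElement" in str(type(chunk)):
        for el in chunk.metadata.orig_elements:
            if "Image" in str(type(el)):
                images_with_context.append({
                    "image_base64": el.metadata.image_base64,
                    "context": chunk.text,  # raw text of the whole parent chunk
                })

# when generating each image summary, include pair["context"] in the prompt
# right next to the base64 image.
```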
Very nice. Is it possible to do this with a local LLM, like an Ollama model?
Yes, absolutely. just use the langchain ollama integration and change the line of code where i use ChatOpenAI or ChatGroq. Be sure to select multimodal models when dealing with images though
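for instance, a minimal sketch (assuming the `langchain-ollama` package is installed and the model has been pulled locally; the model name is just an example):

```python
# minimal sketch: swap the hosted chat model for a local one served by Ollama.
# assumes: pip install langchain-ollama  &&  ollama pull llama3.2-vision
from langchain_ollama import ChatOllama

# drop-in replacement for the ChatOpenAI / ChatGroq line
model = ChatOllama(model="llama3.2-vision")  # multimodal, so it can take image inputs
```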
Nice
How do you recommend automating the conversion of a PDF of images (images of text) to text? The problem is that traditional OCR does not always do the job well, but ChatGPT can handle difficult images.
# til
thanks for your video.
Is it possible to use CrewAI in the same example?
Hi bro, can you create a video on multimodal RAG: chat with video visuals and dialogues?
this sounds cool! i’ll make a video about it!
Thanks @alejandro_ao
Any idea how to install poppler, tesseract, and libmagic on a Windows machine?
what about mathematical equations?
in this example, i embedded them with the rest of the text. if you want to process them separately, you can always extract them from the `CompositeElement` like i did here with the images. then you can maybe have a LLM explain the equation and vectorize that explanation (like we did with the description of the images). in my case, i just put them with the rest of the text, i feel like that gives the LLM enough context about it.
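if you ever do want to treat them separately, here's a rough sketch (assuming the partitioning strategy emits `Formula` elements, which isn't guaranteed for every PDF, and `summarize_chain` is a hypothetical stand-in for the summary chain used for the images):

```python
# rough sketch: pull formula elements out of each chunk and have an LLM
# explain them, mirroring the image-summary step.
formulas = []
for chunk in chunks:
    if "CompositeElement" in str(type(chunk)):
        for el in chunk.metadata.orig_elements:
            if "Formula" in str(type(el)):
                formulas.append(el.text)

# summarize_chain is hypothetical: any LLM chain that explains an equation
formula_summaries = [summarize_chain.invoke(f) for f in formulas]
# vectorize these summaries, just like the image descriptions
```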
@alejandro_ao thanks for the context, I was stuck on this for a week.