If you like this video, you may also like these related videos:
👉🏾In Part 1 of this series, we created the RAG pipeline: th-cam.com/video/ztBJqzBU5kc/w-d-xo.html
👉🏾In Part 2, we created a Streamlit UI for local Ollama models: th-cam.com/video/bAI_jWsLhFM/w-d-xo.html?si=oHwvPjGLxO7l-HXL
What kind of videos would you like to see more of?
Please let me know in the comments below!
I've been looking for this type of video for 2 weeks and finally found it. Thank you, brother!
The most awaited video on YT for RAG!
We also need a robust CSV RAG and other exciting RAG videos as well!
Also, please increase your frequency of dropping videos. Content like yours is gold.
@@THE-AI_INSIDER thank you. I am planning on doing this more often.
@@THE-AI_INSIDER I'm working on the CSV/Excel/structured data RAG
@@tonykipkemboi you rock
Great intro to RAG with a nice UI to boot. Well done. For others following along, I had a ton of problems using Chroma. It would lose its mind between requests (the vector_db became empty for some reason). I had to swap in a Milvus vector_db, and when I did that the whole bit of code became much more reliable. I was able to interrogate a PDF document for about 45 minutes. The responses were cogent with the gemma2 model and sometimes useful! Note I had no problem with Chroma in the previously published video. I'm thinking this is an interaction problem between Chroma and Streamlit's state mechanism, which is kind of strange, as I would expect it to just be an in-memory dictionary or something. Anyway, Milvus worked for me on this project.
Great video Tony!
Thank you, @DataProfessor! 😊
Awesome! It would be great to include a requirements file; I'm having problems with "pdfplumber" and don't know which version it should be.
Yes, will update this today with better README instructions as well.
Great video, Tony. I want to know: what is the method to display the metadata of the PDF alongside the answer?
@@anurag040891 More like citations? I did experiment with it a bit but wasn't happy enough with it to add it to the video.
Hi from Peru!!! Great video!! thanks!!
@@saulojesusbricenowong2758 thank you 😊
Why does Streamlit disconnect after the embedding process is completed?
What do you mean when you say disconnected?
@@tonykipkemboi Hello Tony, and thank you for your amazing job! I'm running into the same error, actually:
2024-09-10 18:12:42 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:12:45 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:12:45 - INFO - Creating vector DB from file upload: mypdf.pdf
2024-09-10 18:12:45 - INFO - File saved to temporary path: C:\Users\XXX\AppData\Local\Temp\tmp9feff731\assurance-prevoyance_com21426.pdf
2024-09-10 18:12:45 - INFO - Document split into chunks
2024-09-10 18:12:46 - INFO - Anonymized telemetry enabled. See docs.trychroma.com/telemetry for more information.
OllamaEmbeddings: 100%|████████| 2/2 [00:01
What do you think about that? I'll try to find a solution on my side.
@@tonykipkemboi Same issue. After the "OllamaEmbeddings: 100%" log, the program exits and doesn't even log "Vector DB created". It ended with the command prompt showing the program stopped. Please help.
same issue
This is really cool! Have you tried adding highlights to PDF for citation?
I haven't actually, but I've thought about it. Probably a future video on this for sure.
Every time I close the app, I have to upload the PDF again, and it creates chunks and embeddings from scratch even if it's the same PDF.
And it always says there's no question in the prompt, but the generated queries are working. I am using Llama3.
@@dhatric6418 First of all, I found Llama3 not to be the best for this, IMO, which is why I used Mistral. The other thing is that Streamlit reruns the page from top to bottom every time you interact with a widget on the app. I added session state to mitigate this, but if you refresh the page, it automatically starts a fresh new session.
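To make the session-state point concrete, here is a minimal sketch of the pattern (create_vector_db is a hypothetical stand-in for the app's actual chunking and embedding logic, not code from the video):

```python
import streamlit as st

def create_vector_db(pdf_file):
    """Hypothetical stand-in for the app's chunk + embed + Chroma logic."""
    return {"source": pdf_file.name}  # placeholder object

uploaded_pdf = st.file_uploader("Upload a PDF", type="pdf")

# Streamlit reruns the whole script on every widget interaction, but
# st.session_state persists across reruns within a single session, so
# the vector DB is only built once instead of on every interaction.
if uploaded_pdf is not None and "vector_db" not in st.session_state:
    st.session_state["vector_db"] = create_vector_db(uploaded_pdf)

if "vector_db" in st.session_state:
    st.write("Vector DB ready; reruns will reuse it instead of re-embedding.")
```

A hard browser refresh still clears st.session_state, which matches the behavior described above.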
@@tonykipkemboi Yeah, it worked, thank you so much!
Awesome. I have one question here: how do you handle PDF content laid out in two columns on the same page?
You can have a selectbox with a dropdown to pick a PDF name, and it gets displayed in the container in the left column. Otherwise, if you decide to create two columns for PDF files, then the third column for chat won't be big enough and clearly readable for the user.
@@tonykipkemboi How can PDF content with two different layouts on the same page be effectively managed? How does the LLM determine the context of the chunks/tokens? Ex: aklassapart.wordpress.com/wp-content/uploads/2010/12/screenshot.png
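Neither the video nor this thread settles the two-column question, but one commonly suggested approach (untested here, so treat it as a sketch) is unstructured's "hi_res" partitioning strategy, which uses a layout-detection model to recover reading order:

```python
from langchain_community.document_loaders import UnstructuredPDFLoader

# "hi_res" runs a layout-detection model over each page, which tends to
# recover reading order on multi-column pages better than plain extraction.
# mode="elements" keeps each detected block (title, paragraph, table) as a
# separate document, so chunks are less likely to mix the two columns.
loader = UnstructuredPDFLoader(
    "two_column.pdf",   # example path
    mode="elements",
    strategy="hi_res",
)
docs = loader.load()

for doc in docs[:5]:
    print(doc.metadata.get("category"), "->", doc.page_content[:80])
```

Note that hi_res generally needs extra dependencies for the detection model and is slower, so it trades speed for layout awareness.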
Nice tutorial, bro. How can you host applications like this that are powered by Ollama, especially with a low-cost approach?
Thank you! I believe you can spin up an EC2 instance on AWS and download Ollama + your model of choice then upload your code to the same instance. This is an example that I haven't deployed or tested yet, but theoretically it should work.
@@tonykipkemboi nice. Thanks for the knowledge. 👍🏽
@@DigitaTransforma 👌
Valuable information and excellent delivery. Thank you so much.
I am subscribing to your channel.
If I have any questions, how do I post them to you?
Thank you once again.
@@bhagavanprasad thank you. Just post it in the comments, I'm pretty responsive.
Hello Sir, first, thanks for making this video. I am trying this solution on my local Windows machine, but while uploading a file I get the error below: "OSError: [WinError 126] The specified module could not be found. Error loading "C:\Data\GharAdhar\workspace\ollama_pdf_rag\venv\lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies." Can you please guide me on what I am missing?
Great content. I'm getting an error when I try to upload the file in the Streamlit interface. I also tried to run the local_ollama_rag.ipynb file in a Jupyter notebook and get the same error when I execute the upload PDF tab. Can someone advise how to resolve this?
OSError: No such file or directory: '/Users/nltk_data/tokenizers/punkt/PY3_tab'.
Thank you. Maybe try installing the "nltk" package to see if it resolves the issue?
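If installing nltk alone doesn't fix it, the missing tokenizers/punkt path usually means the tokenizer data itself was never downloaded. A quick, hedged guess at a fix:

```python
import nltk

# The missing '.../tokenizers/punkt/...' path usually means the punkt
# tokenizer data was never downloaded to this machine.
nltk.download("punkt")
# Newer nltk releases ship the lookup tables as "punkt_tab"; on older
# versions this download just prints an error instead of raising.
nltk.download("punkt_tab")
```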
@@tonykipkemboi Are there any code dependencies on files sourced from the folder 'PY3_tab'? When I installed nltk, I only saw the folder PY3, not PY3_tab.
Nice video! Could you please let me know which Python version you used in the code shown in the video?
I used Python v3.12
Did you create a Python virtual environment to create and run this code?
Yes, I usually create a venv for each project I am working on; good practice. I also have videos under the Python tutorials playlist that demonstrate how to do it and automate it as well.
Is it possible to make a video about a chatbot using Groq, open-source models, and open embeddings that is shareable and usable by others?
It must be pre-trained on data, for example a Google Drive link containing videos, photos, multiple PDF files, and website URLs.
Can you please make another video on how to deploy this Streamlit UI online or on any website?
Deployment is easy using Streamlit Community Cloud. The challenge is how to get the LLM to production. You might need a different deployment strategy, like using something such as AWS EC2 to load Ollama + set up the code to run there in a container. It could work but might not be the best solution, since accuracy is sacrificed a lot.
@@tonykipkemboi A tutorial would be really helpful for beginners.
Your support will be appreciated ❤️❤️
@@HassanAli-tv6fc I hear you. I'll add it to my list of videos.
@@tonykipkemboi It would be really helpful, because you're the one we understand most easily ❤️
It's giving me the error below when running the Streamlit app:
ConnectError: [WinError 10061] No connection could be made because the target machine actively refused it
Please look into it
How are you running the app in the terminal?
@@tonykipkemboi I wrote "streamlit run app.py" -> the usual way
I wrote "streamlit run app.py" -> the usual way
I am having the same issue
@@JDP-uq7zn I assume that you're using a Windows system right?
Hey,
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\Javis\\AppData\\Local\\Temp\\tmp94faevfg'
I keep getting this error despite having the required permissions. Any ideas?
@@javytechnologies5779 Can you share more error logs? Check out this StackOverflow issue related to yours for more options: stackoverflow.com/questions/36434764/permissionerror-errno-13-permission-denied...
I'd also recommend pasting the error into ChatGPT and letting it help you troubleshoot, because this seems to be a file-permission issue.
I am also facing the same issue and still couldn't resolve it... Tried a lot of things, but no success... I have also created an issue on your GitHub repo... Please guide.
@@ziayounasch try the solution above
@@tonykipkemboi I have tried multiple solutions from the link you mentioned, but no success... I don't know what is wrong. Still wondering.
Did you try setting the execution policy?
How many documents can I upload if I change the argument accept_multiple_files=False to True?
You can theoretically upload a billion files, but who does that, right? The only thing you'd have to keep in mind is that if the documents are not on the same topic, the context will be messed up and random unless you create a separate collection for each file. You'd also need to handle retrieving context from the correct collection; see the sketch below.
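A minimal sketch of the one-collection-per-file idea (assuming the nomic-embed-text embedding model is pulled locally in Ollama; add_file_collection is a hypothetical helper, not code from the video):

```python
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OllamaEmbeddings(model="nomic-embed-text")

def add_file_collection(chunks, file_name):
    """Store one file's chunks in its own Chroma collection."""
    # Chroma enforces naming rules for collections, so a sanitized file
    # name is used here as a simple (hypothetical) naming scheme.
    name = file_name.rsplit(".", 1)[0].replace(" ", "_")[:60]
    return Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        collection_name=name,
    )

# Retrieval then has to target the right collection, e.g. by letting the
# user pick a file name in the UI and querying only that vector store.
```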
Can you upload a tutorial where, from scratch, we create an AI assistant on our own data? E.g., an HR assistant...
I think you can adapt this tutorial as long as your documents are all PDFs. You can also easily modify it for other document types.
Brilliant, great content. Cheers!
Can we deploy this on Streamlit?
If yes, can you make a video about it?
You can deploy it but you will also need to deploy an instance of Ollama. You could deploy it to any VPS like EC2 -> download Ollama -> load Streamlit
Is it possible to do this on a Raspberry Pi 5 with a 1TB NVMe M.2 SSD?
Hmm, that's interesting. I think it might work. All you need is to download Ollama on it, and the models as well. Memory seems to be good, but I'd be interested in actually trying it out. Let me know if it works.
Can anyone help? I am unable to upload a PDF file. I am getting the following error message: AxiosError: Request failed with status code 403.
Did you write the code exactly, or are there any changes? Please check that first. Also make sure the versions of the various packages and modules are up to date.
Really nice to see the next video, i.e., creating a UI for chatting with a PDF file, but I am facing one issue here. I am getting an error at the statement loader.load(): ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.
This all started after I installed the streamlit and ollama packages, and because of this, even the earlier code started throwing an error at the same line. Any help resolving this issue would be great. Also, in the previous "non-UI-based chatting with PDF" video, I had to install langchain_community and not langchain; only then did it work.
I tried using the packages you specified in the requirements file, but now a new problem is coming up: the moment it runs the OllamaEmbeddings step, the program exits silently without throwing any error. I think others also mentioned the same thing.
I think the error "ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed." appears after installing streamlit.
@@spotnuru83 Same issue here
I am getting an error saying "Question: (None specified)":
Based on the provided context, I will answer the question.
Question: (None specified)
Since there is no specific question, I will assume that the goal is to summarize Bhagavan Prasad's work experience and skills.
Am I making any mistake?
@@bhagavanprasad did you enter a question when the input box popped up?
@@tonykipkemboi Thank you very much for the quick reply.
I am giving the input in the Streamlit UI but still facing the issue.
By the way, I am running the code as is, without any modifications.
Meantime, I am also trying to debug the issue.
@@bhagavanprasad It seems to say that you didn't ask a question, which is a bit weird. Can you give more detail on the model you're using?
@@tonykipkemboi These are the logs
2024-07-09 11:48:11 - INFO - template :Answer the question based ONLY on the following context:
{context}
Question: {question}
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Only provide the answer from the {context}, nothing else.
Add snippets of the context you used to answer the question.
2024-07-09 11:48:11 - INFO - prompt :input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="Answer the question based ONLY on the following context:
{context}
Question: {question}
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Only provide the answer from the {context}, nothing else.
Add snippets of the context you used to answer the question.
"))]
2024-07-09 11:49:01 - INFO - Generated queries: ['Here are three different versions of the original question:', '', '"What is Bhagavan\'s contact information?"', '"What can you tell me about Bhagavan\'s communication details?"', '"How do I get in touch with Bhagavan?"', '', 'These alternative questions aim to capture the essence of the original query while rephrasing it in slightly different ways. By doing so, we can increase the chances of retrieving relevant documents from the vector database that might not have matched exactly with the original question.']
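For anyone debugging the same "(None specified)" symptom: it usually means the user's question never reached the final answer prompt. Here is a hedged sketch of the wiring (not the video's exact code; "retriever" is assumed to be defined earlier, e.g. vector_db.as_retriever() or a MultiQueryRetriever):

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama

template = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
llm = ChatOllama(model="mistral")

# "retriever" is assumed to be defined earlier (e.g. vector_db.as_retriever()).
# RunnablePassthrough forwards the string passed to invoke() into the
# "question" slot; if that mapping is missing, the model sees no question.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# chain.invoke("What is Bhagavan's contact information?")
```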
Nice tutorial! Will the above code work on a 2015 MacBook Pro with 8 GB RAM and a 256 GB SSD? Please let me know, or should I use Google Colab?
Yes this is possible. The only limitation I can think of with your setup is memory to handle the embeddings model and the LLM of your choice like Mistral or Llama3. If you feel constrained with memory, try downloading the quantized versions which will have low memory requirements.
@@tonykipkemboi Any material on how to download the quantized versions? I am new to this field. Thank you.
@@karthikb.s.k.4486 If you go to ollama.com/library/llama3, for example, select the dropdown button and scroll down, you'll see smaller (quantized) model versions that might better fit your system's memory requirements. Click on the one you want to use, and on the right side you'll see the command you need to run to install your selection.
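If you'd rather pull from code than the terminal, the ollama Python client can do the same thing. The exact tag comes from that dropdown; the one below is only an example and may differ from what's currently listed:

```python
import ollama

# Pull a smaller quantized tag instead of the default full-size model.
# The tag name is an example taken from the ollama.com library dropdown.
ollama.pull("llama3:8b-instruct-q4_0")
```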
@@tonykipkemboi Thank you
Does Mistral run locally?
Yes, everything is running locally. I'm using Ollama to run the models.
@@tonykipkemboi Word! I looked, and it appears Mistral is run entirely in Docker. Does your implementation require an Nvidia graphics card? I get lab tests back regularly in PDF format, but the API is pricey. Wondering if I can set up an RPi to automate collecting the data from those lab tests into a spreadsheet, or maybe I need a slightly beefier computer to do this.
I've tried various services to do this, but the lab results are very dynamic and context is needed to understand the values, so using an LLM for this makes more sense to me.
Please create a requirements.txt file for the modules. 😢
Just added!
Correct
RAGs are useless.
Can you share more on why "RAGs are useless"?
Since RAGs are useless, would you prefer fine-tuning models then? I guess you have millions of dollars to spend
❤Support me plz
Support how?
Hello Tony, and thank you for your amazing job!! I just have an error: Streamlit always interrupts itself like this:
2024-09-10 18:24:16 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:24:16 - INFO - Extracting model names from models_info
2024-09-10 18:24:16 - INFO - Extracted model names: ('nomic-embed-text:latest', 'mistral-nemo:latest', 'mistral:latest')
2024-09-10 18:24:18 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:24:21 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:24:21 - INFO - Creating vector DB from file upload: monopoly.pdf
2024-09-10 18:24:21 - INFO - File saved to temporary path: C:\Users\XXX\AppData\Local\Temp\tmpobt9kjka\monopoly.pdf
2024-09-10 18:24:24 - INFO - pikepdf C++ to Python logger bridge initialized
2024-09-10 18:24:25 - INFO - Document split into chunks
OllamaEmbeddings: 100%|████████| 3/3 [00:00
Let me know if you want more info :)
@@gilfcr8620 Thanks for posting. Can you share the error message itself? These seem to be just the logs. Also describe what you're seeing/expecting and where it fails.
@@tonykipkemboi Hmm, I tried to run this application in a debug session in VS Code and no error appears... it just exits after the embeddings...
What is your advice? I'm not a full-time developer, but I can click wherever you want :)
@@tonykipkemboi OK, I found it!! This is a Chroma issue. I just changed:
vector_db = Chroma.from_documents(documents=chunks, embedding=embeddings, collection_name="myRAG")
to:
vector_db = Chroma.from_documents(persist_directory=CHROMA_PATH, documents=chunks, embedding=embeddings, collection_name="myRAG")
and used the langchain_chroma package.
Thank you!
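Putting the commenter's fix together, a minimal self-contained version might look like this (CHROMA_PATH is the commenter's variable with an example value here, and chunks is assumed to come from the app's text splitter; this is a sketch, not the video's exact code):

```python
from langchain_chroma import Chroma          # the langchain_chroma package
from langchain_community.embeddings import OllamaEmbeddings

CHROMA_PATH = "./chroma_db"  # any writable directory; this name is an example

embeddings = OllamaEmbeddings(model="nomic-embed-text")

# Persisting to disk (instead of the default in-memory client) is what
# resolved the "empty vector_db between reruns" behavior reported above.
vector_db = Chroma.from_documents(
    documents=chunks,        # assumed to come from the app's text splitter
    embedding=embeddings,
    collection_name="myRAG",
    persist_directory=CHROMA_PATH,
)
```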