If you like this video, you may also like these related videos:
👉🏾In Part 1 of this series, we created the RAG pipeline: th-cam.com/video/ztBJqzBU5kc/w-d-xo.html
👉🏾In Part 2, we created a Streamlit UI for local Ollama models: th-cam.com/video/bAI_jWsLhFM/w-d-xo.html?si=oHwvPjGLxO7l-HXL
What kind of videos would you like to see more of?
Please let me know in the comments below!
I've been looking for this type of video for 2 weeks and finally found it. Thank you, brother!
The most awaited video on YT for RAG!
We also need a robust CSV RAG and other exciting RAG videos as well!
Also, please increase your frequency of dropping videos. Content like yours is gold.
@@THE-AI_INSIDER thank you. I am planning on doing this more often.
@@THE-AI_INSIDER I'm working on the CSV/Excel/structured data RAG
@@tonykipkemboi you rock
Great intro to RAG with a nice UI to boot. Well done. For others following along, I had a ton of problems using Chroma. It would lose its mind between requests (the vector_db became empty for some reason). I had to swap in a Milvus vector_db, and when I did that the whole bit of code became much more reliable. I was able to interrogate a PDF document for about 45 minutes. The responses were cogent with the gemma2 model and sometimes useful! Note I had no problem with Chroma in the previously published video. I'm thinking this is an interaction problem between Chroma and Streamlit's state mechanism, which is kind of strange, as I would expect it to just be an in-memory dictionary or something. Anyway, Milvus worked for me on this project.
Great video Tony!
Thank you, @DataProfessor! 😊
Awesome! It would be great to include a requirements file; I'm having problems with "pdfplumber" and don't know which version it should be.
Yes, will update this today with better README instructions as well.
Great video, Tony. I want to know: what is the method to display the metadata of the PDF alongside the answer?
@@anurag040891 More like citations? I did experiment with it a bit but wasn't happy enough with it to add it to the video.
Hi from Peru!!! Great video!! thanks!!
@@saulojesusbricenowong2758 thank you 😊
Why does Streamlit disconnect after the embedding process is completed?
What do you mean when you say disconnected?
@@tonykipkemboi Hello Tony, and thank you for your amazing job! I'm running into the same error, actually:
2024-09-10 18:12:42 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:12:45 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:12:45 - INFO - Creating vector DB from file upload: mypdf.pdf
2024-09-10 18:12:45 - INFO - File saved to temporary path: C:\Users\XXX\AppData\Local\Temp\tmp9feff731\assurance-prevoyance_com21426.pdf
2024-09-10 18:12:45 - INFO - Document split into chunks
2024-09-10 18:12:46 - INFO - Anonymized telemetry enabled. See docs.trychroma.com/telemetry for more information.
OllamaEmbeddings: 100%|████████| 2/2 [00:01
What do you think about that? I'll try to find a solution on my side.
@@tonykipkemboi Same issue. After the "OllamaEmbeddings: 100%" log, the program exits and doesn't even log "Vector DB created". It ended with the command prompt showing the program stopped. Please help.
same issue
This is really cool! Have you tried adding highlights to PDF for citation?
I haven't actually, but I've thought about it. Probably a future video on this for sure.
Every time I close the app, I have to upload the PDF again, and it creates chunks and embeddings from scratch even if it's the same PDF.
And it always says there's no question in the prompt, but the generated queries are working. I am using Llama3.
@@dhatric6418 First of all, I found Llama3 not to be the best for this, IMO, which is why I used Mistral. The other thing is that Streamlit reruns the page from top to bottom every time you interact with a widget on the app. I added session state to mitigate this, but if you refresh the page, it automatically starts a fresh new session.
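To make the session-state point concrete, here is a minimal sketch of the pattern (create_vector_db is a hypothetical stand-in for the app's actual chunking and embedding logic, not code from the video):

```python
import streamlit as st

def create_vector_db(pdf_file):
    """Hypothetical stand-in for the app's chunk + embed + Chroma logic."""
    return {"source": pdf_file.name}  # placeholder object

uploaded_pdf = st.file_uploader("Upload a PDF", type="pdf")

# Streamlit reruns the whole script on every widget interaction, but
# st.session_state persists across reruns within a single session, so
# the vector DB is only built once instead of on every interaction.
if uploaded_pdf is not None and "vector_db" not in st.session_state:
    st.session_state["vector_db"] = create_vector_db(uploaded_pdf)

if "vector_db" in st.session_state:
    st.write("Vector DB ready; reruns will reuse it instead of re-embedding.")
```

A hard browser refresh still clears st.session_state, which matches the behavior described above.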
@@tonykipkemboi Yeah, it worked, thank you so much!
Awesome. I have one question here: how do you handle PDF content laid out in two columns on the same page?
You can have a selectbox with a dropdown to pick a PDF name, and it gets displayed in the container in the left column. Otherwise, if you decide to create two columns for PDF files, then the third column for chat won't be big enough and clearly readable for the user.
@@tonykipkemboi How can PDF content with two different layouts on the same page be effectively managed? How does the LLM determine the context of the chunks/tokens? Ex: aklassapart.wordpress.com/wp-content/uploads/2010/12/screenshot.png
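Neither the video nor this thread settles the two-column question, but one commonly suggested approach (untested here, so treat it as a sketch) is unstructured's "hi_res" partitioning strategy, which uses a layout-detection model to recover reading order:

```python
from langchain_community.document_loaders import UnstructuredPDFLoader

# "hi_res" runs a layout-detection model over each page, which tends to
# recover reading order on multi-column pages better than plain extraction.
# mode="elements" keeps each detected block (title, paragraph, table) as a
# separate document, so chunks are less likely to mix the two columns.
loader = UnstructuredPDFLoader(
    "two_column.pdf",   # example path
    mode="elements",
    strategy="hi_res",
)
docs = loader.load()

for doc in docs[:5]:
    print(doc.metadata.get("category"), "->", doc.page_content[:80])
```

Note that hi_res generally needs extra dependencies for the detection model and is slower, so it trades speed for layout awareness.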
Nice tutorial, bro. How can you host applications like this that are powered by Ollama, especially with a low-cost approach?
Thank you! I believe you can spin up an EC2 instance on AWS and download Ollama + your model of choice then upload your code to the same instance. This is an example that I haven't deployed or tested yet, but theoretically it should work.
@@tonykipkemboi nice. Thanks for the knowledge. 👍🏽
@@DigitaTransforma 👌
Valuable information and excellent delivery. Thank you so much.
I am subscribing to your channel.
If I have any questions, how do I post them to you?
Thank you once again.
@@bhagavanprasad thank you. Just post it in the comments, I'm pretty responsive.
Hello Sir, first, thanks for making this video. I am trying this solution on my local Windows machine, but while uploading a file I get the error below: "OSError: [WinError 126] The specified module could not be found. Error loading "C:\Data\GharAdhar\workspace\ollama_pdf_rag\venv\lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies." Can you please guide me on what I am missing?
Great content. I'm getting an error when I try to upload the file in the Streamlit interface. I also tried to run the local_ollama_rag.ipynb file in a Jupyter notebook and get the same error when I execute the upload PDF tab. Can someone advise how to resolve this?
OSError: No such file or directory: '/Users/nltk_data/tokenizers/punkt/PY3_tab'.
Thank you. Maybe try installing the "nltk" package to see if it resolves the issue?
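If installing nltk alone doesn't fix it, the missing tokenizers/punkt path usually means the tokenizer data itself was never downloaded. A quick, hedged guess at a fix:

```python
import nltk

# The missing '.../tokenizers/punkt/...' path usually means the punkt
# tokenizer data was never downloaded to this machine.
nltk.download("punkt")
# Newer nltk releases ship the lookup tables as "punkt_tab"; on older
# versions this download just prints an error instead of raising.
nltk.download("punkt_tab")
```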
@@tonykipkemboi Are there any code dependencies on files sourced from the folder 'PY3_tab'? When I installed nltk, I only saw the folder PY3, not PY3_tab.
Nice video! Could you please let me know which Python version you used in the code shown in the video?
I used Python v3.12
Did you create a Python virtual environment to create and run this code?
Yes, I usually create a venv for each project I am working on; good practice. I also have videos under the Python tutorials playlist that demonstrate how to do it and automate it as well.
Is it possible to make a video about a chatbot using Groq, open-source models, and open embeddings that is shareable and usable by others?
It must be pre-trained on data, for example a Google Drive link containing videos, photos, multiple PDF files, and website URLs.
Can you please make another video on how to deploy this Streamlit UI online or on any website?
Deployment is easy using Streamlit Community Cloud. The challenge is how to get the LLM to production. You might need a different deployment strategy, like using something such as AWS EC2 to load Ollama + set up the code to run there in a container. It could work but might not be the best solution, since accuracy is sacrificed a lot.
@@tonykipkemboi A tutorial would be really helpful for beginners.
Your support will be appreciated ❤️❤️
@@HassanAli-tv6fc I hear you. I'll add it to my list of videos.
@@tonykipkemboi It would be really helpful, because you're the one we understand most easily ❤️
It's giving me the error below when running the Streamlit app:
ConnectError: [WinError 10061] No connection could be made because the target machine actively refused it
Please look into it
How are you running the app in the terminal?
@@tonykipkemboi I wrote "streamlit run app.py" -> the usual way
I wrote "streamlit run app.py" -> the usual way
I am having the same issue
@@JDP-uq7zn I assume that you're using a Windows system right?
Hey,
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\Javis\\AppData\\Local\\Temp\\tmp94faevfg'
I keep getting this error despite having the required permissions. Any ideas?
@@javytechnologies5779 Can you share more error logs? Check out this StackOverflow issue related to yours for more options: stackoverflow.com/questions/36434764/permissionerror-errno-13-permission-denied...
I'd also recommend pasting the error into ChatGPT and letting it help you troubleshoot, because this seems to be a file-permission issue.
I am also facing the same issue and still couldn't resolve it... Tried a lot of things, but no success... I have also created an issue on your GitHub repo... Please guide.
@@ziayounasch try the solution above
@@tonykipkemboi I have tried multiple solutions from the link you mentioned, but no success... I don't know what is wrong. Still wondering.
Did you try setting the execution policy?
How many documents can I upload if I change the argument accept_multiple_files=False to True?
You can theoretically upload a billion files, but who does that, right? The only thing you'd have to keep in mind is that if the documents are not on the same topic, the context will be messed up and random unless you create a separate collection for each file. You'd also need to handle retrieving context from the correct collection; see the sketch below.
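A minimal sketch of the one-collection-per-file idea (assuming the nomic-embed-text embedding model is pulled locally in Ollama; add_file_collection is a hypothetical helper, not code from the video):

```python
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OllamaEmbeddings(model="nomic-embed-text")

def add_file_collection(chunks, file_name):
    """Store one file's chunks in its own Chroma collection."""
    # Chroma enforces naming rules for collections, so a sanitized file
    # name is used here as a simple (hypothetical) naming scheme.
    name = file_name.rsplit(".", 1)[0].replace(" ", "_")[:60]
    return Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        collection_name=name,
    )

# Retrieval then has to target the right collection, e.g. by letting the
# user pick a file name in the UI and querying only that vector store.
```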
Can you upload a tutorial where, from scratch, we create an AI assistant on our own data? E.g., an HR assistant...
I think you can adapt this tutorial as long as your documents are all PDFs. You can also easily modify it for other document types.
Brilliant, great content. Cheers!
Can we deploy this on Streamlit?
If yes, can you make a video about it?
You can deploy it but you will also need to deploy an instance of Ollama. You could deploy it to any VPS like EC2 -> download Ollama -> load Streamlit
Is it possible to do this on a Raspberry Pi 5 with a 1TB NVMe M.2 SSD?
Hmm, that's interesting. I think it might work. All you need is to download Ollama on it, and the models as well. Memory seems to be good, but I'd be interested in actually trying it out. Let me know if it works.
Can anyone help? I am unable to upload a PDF file. I am getting the following error message: AxiosError: Request failed with status code 403.
Did you write the code exactly, or are there any changes? Please check that first. Also make sure the versions of the various packages and modules are up to date.
Really nice to see the next video, i.e., creating a UI for chatting with a PDF file, but I am facing one issue here. I am getting an error at the statement loader.load(): ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.
This all started after I installed the streamlit and ollama packages, and because of this, even the earlier code started throwing an error at the same line. Any help resolving this issue would be great. Also, in the previous "non-UI-based chatting with PDF" video, I had to install langchain_community and not langchain; only then did it work.
I tried using the packages you specified in the requirements file, but now a new problem is coming up: the moment it runs the OllamaEmbeddings step, the program exits silently without throwing any error. I think others also mentioned the same thing.
I think the error "ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed." appears after installing streamlit.
@@spotnuru83 Same issue here
I am getting an error saying "Question: (None specified)":
Based on the provided context, I will answer the question.
Question: (None specified)
Since there is no specific question, I will assume that the goal is to summarize Bhagavan Prasad's work experience and skills.
Am I making any mistake?
@@bhagavanprasad did you enter a question when the input box popped up?
@@tonykipkemboi Thank you very much for the quick reply.
I am giving the input in the Streamlit UI but still facing the issue.
By the way, I am running the code as is, without any modifications.
Meantime, I am also trying to debug the issue.
@@bhagavanprasad It seems to say that you didn't ask a question, which is a bit weird. Can you give more detail on the model you're using?
@@tonykipkemboi These are the logs
2024-07-09 11:48:11 - INFO - template :Answer the question based ONLY on the following context:
{context}
Question: {question}
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Only provide the answer from the {context}, nothing else.
Add snippets of the context you used to answer the question.
2024-07-09 11:48:11 - INFO - prompt :input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="Answer the question based ONLY on the following context:
{context}
Question: {question}
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Only provide the answer from the {context}, nothing else.
Add snippets of the context you used to answer the question.
"))]
2024-07-09 11:49:01 - INFO - Generated queries: ['Here are three different versions of the original question:', '', '"What is Bhagavan\'s contact information?"', '"What can you tell me about Bhagavan\'s communication details?"', '"How do I get in touch with Bhagavan?"', '', 'These alternative questions aim to capture the essence of the original query while rephrasing it in slightly different ways. By doing so, we can increase the chances of retrieving relevant documents from the vector database that might not have matched exactly with the original question.']
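For anyone debugging the same "(None specified)" symptom: it usually means the user's question never reached the final answer prompt. Here is a hedged sketch of the wiring (not the video's exact code; "retriever" is assumed to be defined earlier, e.g. vector_db.as_retriever() or a MultiQueryRetriever):

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama

template = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
llm = ChatOllama(model="mistral")

# "retriever" is assumed to be defined earlier (e.g. vector_db.as_retriever()).
# RunnablePassthrough forwards the string passed to invoke() into the
# "question" slot; if that mapping is missing, the model sees no question.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# chain.invoke("What is Bhagavan's contact information?")
```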
Nice tutorial! Will the above code work on a 2015 MacBook Pro with 8 GB RAM and a 256 GB SSD? Please let me know, or should I use Google Colab?
Yes this is possible. The only limitation I can think of with your setup is memory to handle the embeddings model and the LLM of your choice like Mistral or Llama3. If you feel constrained with memory, try downloading the quantized versions which will have low memory requirements.
@@tonykipkemboi Any material on how to download the quantized versions? I am new to this field. Thank you.
@@karthikb.s.k.4486 If you go to ollama.com/library/llama3, for example, select the dropdown button and scroll down, you'll see smaller (quantized) model versions that might better fit your system's memory requirements. Click on the one you want to use, and on the right side you'll see the command you need to run to install your selection.
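If you'd rather pull from code than the terminal, the ollama Python client can do the same thing. The exact tag comes from that dropdown; the one below is only an example and may differ from what's currently listed:

```python
import ollama

# Pull a smaller quantized tag instead of the default full-size model.
# The tag name is an example taken from the ollama.com library dropdown.
ollama.pull("llama3:8b-instruct-q4_0")
```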
@@tonykipkemboi Thank you
Does Mistral run locally?
Yes, everything is running locally. I'm using Ollama to run the models.
@@tonykipkemboi Word! I looked, and it appears Mistral is run entirely in Docker. Does your implementation require an Nvidia graphics card? I get lab tests back regularly in PDF format, but the API is pricey. Wondering if I can set up an RPi to automate collecting the data from those lab tests into a spreadsheet, or maybe I need a slightly beefier computer to do this.
I've tried various services to do this, but the lab results are very dynamic and context is needed to understand the values, so using an LLM for this makes more sense to me.
Please create a requirements.txt file for the modules. 😢
Just added!
Correct
RAGs are useless.
Can you share more on why "RAGs are useless"?
Since RAGs are useless, would you prefer fine-tuning models then? I guess you have millions of dollars to spend
❤Support me plz
Support how?
Hello Tony, and thank you for your amazing job!! I just have an error: Streamlit always interrupts itself like this:
2024-09-10 18:24:16 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:24:16 - INFO - Extracting model names from models_info
2024-09-10 18:24:16 - INFO - Extracted model names: ('nomic-embed-text:latest', 'mistral-nemo:latest', 'mistral:latest')
2024-09-10 18:24:18 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:24:21 - INFO - HTTP Request: GET localhost:11434/api/tags "HTTP/1.1 200 OK"
2024-09-10 18:24:21 - INFO - Creating vector DB from file upload: monopoly.pdf
2024-09-10 18:24:21 - INFO - File saved to temporary path: C:\Users\XXX\AppData\Local\Temp\tmpobt9kjka\monopoly.pdf
2024-09-10 18:24:24 - INFO - pikepdf C++ to Python logger bridge initialized
2024-09-10 18:24:25 - INFO - Document split into chunks
OllamaEmbeddings: 100%|████████| 3/3 [00:00
Let me know if you want more info :)
@@gilfcr8620 Thanks for posting. Can you share the error message itself? These seem to be just the logs. Also describe what you're seeing/expecting and where it fails.
@@tonykipkemboi Hmm, I tried to run this application in a debug session in VS Code and no error appears... it just exits after the embeddings...
What is your advice? I'm not a full-time developer, but I can click wherever you want :)
@@tonykipkemboi OK, I found it!! This is a Chroma issue. I just changed:
vector_db = Chroma.from_documents(documents=chunks, embedding=embeddings, collection_name="myRAG")
to:
vector_db = Chroma.from_documents(persist_directory=CHROMA_PATH, documents=chunks, embedding=embeddings, collection_name="myRAG")
and used the langchain_chroma package.
Thank you!
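Putting the commenter's fix together, a minimal self-contained version might look like this (CHROMA_PATH is the commenter's variable with an example value here, and chunks is assumed to come from the app's text splitter; this is a sketch, not the video's exact code):

```python
from langchain_chroma import Chroma          # the langchain_chroma package
from langchain_community.embeddings import OllamaEmbeddings

CHROMA_PATH = "./chroma_db"  # any writable directory; this name is an example

embeddings = OllamaEmbeddings(model="nomic-embed-text")

# Persisting to disk (instead of the default in-memory client) is what
# resolved the "empty vector_db between reruns" behavior reported above.
vector_db = Chroma.from_documents(
    documents=chunks,        # assumed to come from the app's text splitter
    embedding=embeddings,
    collection_name="myRAG",
    persist_directory=CHROMA_PATH,
)
```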