And by the way, the shared code is not using the DB at all. Try running the notebook without the DB connection part and it will work the same way. The whole RAG process is built on top of RAM with LlamaIndex.
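A minimal sketch of that in-memory setup, assuming a recent llama-index where the core imports live under llama_index.core:

# Nothing here touches an external DB: by default VectorStoreIndex
# keeps its vectors in an in-memory SimpleVectorStore.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./pdfs").load_data()   # "./pdfs" is a placeholder path
)
# Only an explicit persist() would write anything to disk:
# index.storage_context.persist("./storage")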
😎Cool stuff and a well-paced presentation. Interesting: we now have an offline alternative to having to upload the entire PDF to the web and run a browser-based AI chatbot for queries. By the way, thank you for introducing us to LLaMA 2. Waiting for more videos on this. 👍👍
How can I embed this in my website?
Love your teaching style! It would be great to see another video without the use of Gradient or any other paid service... it gets pretty expensive quickly...
Great suggestion!
Hi Bhavesh, can you please tell us the version of llama-index you used for this project? I am having issues with the imports, as newer versions of llama-index seem to have changed a few things. Please reply!!
Same problem man
Long live the engineer, brother!
Sir, as a medical student I want to create such a system for asking questions from medical book PDFs, but the size is large, around 600 MB and nearly 2000 pages. Is it possible to do so?
Currently I use a website named ChatPDF, which only allows asking questions on uploaded PDFs no larger than 10 MB, and we can upload only 2 PDFs daily.
Suppose I only have unstructured/semi-structured tabular data in a PDF; what approach can we use other than the RAG framework? My task is to summarise the tabular data.
What if I also want the retrieved documents? How do I get those?
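For reference, the retrieved chunks are available on the response object; a small sketch, assuming a query engine built as in the video:

# Each answer carries the chunks the retriever pulled in.
response = query_engine.query("What does the paper conclude?")
for nws in response.source_nodes:                   # NodeWithScore objects
    print(nws.score, nws.node.get_content()[:200])  # similarity score + text preview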
I tried running your Streamlit code but it fails due to the st.cache() deprecation. Any fixes for it?
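A likely fix, since recent Streamlit releases removed st.cache: use st.cache_data for plain values and st.cache_resource for heavyweight objects like an index (load_index below is a hypothetical placeholder):

import streamlit as st

@st.cache_resource        # replacement for the deprecated @st.cache
def get_index():
    return load_index()   # hypothetical: however the notebook builds the index

index = get_index()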
I am trying to solve a similar problem for my organisation, so I have some questions. I found research that uses OpenAI in the background with LangChain.
1. I don't want to use OpenAI for security reasons.
2. Can we use these models without internet, using Python?
cannot import name 'ServiceContext' from 'llama_index'
This is the error showing up every time I run the file in Colab. Please help me out.
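That import error usually means a llama-index release from after the video (0.10+), where ServiceContext was deprecated in favour of a global Settings object under llama_index.core. A sketch of the newer equivalent, assuming you only need to set the LLM and chunking:

# llama-index >= 0.10: ServiceContext is gone; configure via Settings.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex

Settings.llm = llm            # whatever LLM object you constructed earlier
Settings.chunk_size = 512     # mirrors the old ServiceContext arguments

docs = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(docs)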
For Gradient, use:
%pip install llama-index-llms-gradient
%pip install llama-index-llms-openai
%pip install llama-index-readers-file pymupdf
%pip install llama-index-finetuning
!pip install llama-index gradientai -q
from llama_index.llms.gradient import GradientBaseModelLLM
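And a short usage sketch for that import, assuming the two Gradient credentials are set as environment variables (the slug value is an assumption, mirroring the Llama 2 chat model used in the video):

import os
from llama_index.llms.gradient import GradientBaseModelLLM

os.environ["GRADIENT_ACCESS_TOKEN"] = "..."   # your token, left as a placeholder
os.environ["GRADIENT_WORKSPACE_ID"] = "..."   # your workspace id

llm = GradientBaseModelLLM(
    base_model_slug="llama2-7b-chat",   # hosted Llama 2 7B chat model
    max_tokens=400,
)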
Hi Bhavesh, kindly don't rely heavily on third-party sites for creating the vector DB for small use cases, when it can be handled easily in RAM using LanceDB and other similar packages. Currently, 3 months after your upload, hardly any of the llama imports work; most things are deprecated, which seems natural, but with the version clashes it's hard to get anything running. I request you to please provide a dependency file too, so that versions can be compared when referring to your projects.
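One way to produce such a dependency file straight from the notebook (the pinned versions then come from whatever environment actually ran the code):

# Capture the exact versions the notebook ran with.
!pip freeze > requirements.txt

# Readers reproduce the environment with:
# !pip install -r requirements.txt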
It would be good to test the same with a local Llama 2 instead of using Gradient, and see if the quality of the answers is the same.
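A sketch of one way to do that swap, assuming Ollama is installed locally with the model pulled (ollama pull llama2) and the llama-index-llms-ollama package:

# Local Llama 2 through Ollama instead of Gradient's hosted API.
from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama2", request_timeout=120.0)
print(llm.complete("Summarise retrieval-augmented generation in one line."))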
I have created a database, but I can't see my code. Kindly guide me on where to check the code, sir.
Can I use this for my philosophy research paper readings, to understand and chat with the PDFs? Or do you know any other PDF-chatting site?
Yes!
How would one do this, or what are the steps, when the content is web-based? The QnA chatbot has to answer questions based on the content of a given website. We are using Llama 2 7B and it is not giving accurate answers to the questions asked; the answers have to come from the website, but sometimes it gives additional information that is not part of the website. How would we fine-tune and train it? It would be helpful if you could share some suggestions or do a video on that.
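For the loading side, a sketch using LlamaIndex's web reader (assumes the llama-index-readers-web package; the URL is a placeholder). Keeping answers grounded is then mostly prompt work, e.g. customising the QA prompt template to forbid information outside the retrieved context:

# Build the index from web pages instead of PDFs.
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import SimpleWebPageReader

docs = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://example.com/docs"]   # placeholder URL
)
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine()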
Have you tried testing with multiple papers on the same topic? Is the retriever then pulling all the related data from each PDF well? And is Llama 2 then able to summarise all the document fragments well when there are many more of them? Actually, I'm building an equivalent setup with videos, using Weaviate, but I'm facing the above issues. Even with Weaviate, the LLM part is only compatible with paid APIs now.
Make some manual-testing tutorials, or tell me which is the best manual-testing channel, because I'm starting to learn. Please help me.
Sure 👍
1) What are the minimum system requirements? 😅
2) Sir, I want to create a chatbot over 6 subject PDFs of around 1000 pages; the PDFs are not OCRed.
3) Is it completely open source, or is there any subscription?
Can anyone tell me how I can extract specific content from the PDF and save it in JSON format?
The content is multiple-choice questions; I want to extract only those questions with their relevant options, but there is other content present that should be excluded.
Jsonify the response!
@@bhattbhavesh91 but there is also other content which I don't want in the response, and with my code I am getting it. I got the response in JSON, but not the one I wanted.
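One way to keep only the wanted fields is to extract them yourself rather than trusting the model's JSON; a sketch, assuming the MCQs follow a "question, then options A) to D)" layout (the regex must be adapted to the real PDF text):

import json
import re

# Hypothetical layout: a question line ending in '?', then options A)-D).
MCQ_RE = re.compile(
    r"(?P<q>[^\n]+\?)\s*"
    r"A\)\s*(?P<a>[^\n]+)\s*B\)\s*(?P<b>[^\n]+)\s*"
    r"C\)\s*(?P<c>[^\n]+)\s*D\)\s*(?P<d>[^\n]+)"
)

def extract_mcqs(text):
    items = [
        {"question": m["q"].strip(),
         "options": [m["a"].strip(), m["b"].strip(),
                     m["c"].strip(), m["d"].strip()]}
        for m in MCQ_RE.finditer(text)
    ]
    return json.dumps(items, indent=2)   # only questions + options survive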
Nice video🎉
Thank you so much 😀
Also make it so that when the query is not related to the PDF, the LLM responds on its own.
Sure
Anyone here who got an Unauthorized exception while running the line with the base model slug and max tokens?
Where is the JSON file?
It's using a vector DB to store the unstructured data (the PDFs) as structured data (nodes & a node index).
A retriever then fetches the relevant nodes from the vector DB for the LLM (Llama 2).
Once retrieved & merged into the prompt,
the LLM is then ready to:
- take the input prompt P,
- evaluate P (using the model), then
- output a completion C (using all of Llama 2's training data & the PDFs).
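That flow as a hedged sketch in a recent llama-index layout, with the retrieval step made explicit to mirror the description above:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# 1. PDFs -> nodes -> vector index (the structured data).
docs = SimpleDirectoryReader("./pdfs").load_data()
index = VectorStoreIndex.from_documents(docs)

# 2. The retriever (not the LLM itself) pulls the relevant nodes for prompt P.
retriever = index.as_retriever(similarity_top_k=3)
nodes = retriever.retrieve("What method does the paper propose?")

# 3. The query engine merges the retrieved text with P and the LLM
#    emits completion C.
print(index.as_query_engine().query("What method does the paper propose?"))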
I hope you don't get too popular and stay with us, keeping us ahead of others,
...❤
Sorry I found it 🙂
😂
LOL me too😂😂 @@yt_souvik
Insecurities are popping up 😂
But Why? 😮