Building a Document-based Question Answering System with LangChain, Pinecone, and LLMs like GPT-4.

  • Published 22 Sep 2024
  • Learn how to build a powerful document-based question-answering system using LangChain, Pinecone, and advanced LLMs like GPT-4 and ChatGPT. Unlock the potential of semantic search and AI-driven insights to create precise and context-aware AI applications. Watch now and elevate your projects with cutting-edge techniques!
    Code: blog.futuresma...
    AI Demos: www.aidemos.com/
    AIDemos.com is your go-to directory for video demos of the latest AI tools. AI Demos' goal is to educate and inform about the possibilities of AI.
    Revolutionizing Search: How to Combine Semantic Search with GPT-3 Q&A
    • Revolutionizing Search...
    GPT-4 API Tutorial: • GPT-4 API: Real-World ...
    ChatGPT API Tutorial: • Using OpenAI's ChatGPT...
    Building a GPT-4 Chatbot using ChatGPT API and Streamlit Chat
    • Building a GPT-4 Chatb...
    🚀 Top Rated Plus Data Science Freelancer with 8+ years of experience, specializing in NLP and Back-End Development. Founder of FutureSmart AI, helping clients build custom AI NLP applications using cutting-edge models and techniques. Former Lead Data Scientist at Oracle, primarily working on NLP and MLOps.
    💡 As a Freelancer on Upwork, I have earned over $60K with a 100% Job Success rate, creating custom NLP solutions using GPT-3, ChatGPT, GPT-4, and Hugging Face Transformers. Expert in building applications involving semantic search, sentence transformers, vector databases, and more.
    #LangChain #Pinecone #GPT4 #ChatGPT #SemanticSearch #DocumentQnA

Comments • 172

  • @FutureSmartAI
    @FutureSmartAI  ปีที่แล้ว +6

    Code: blog.futuresmart.ai/building-a-document-based-question-answering-system-with-langchain-pinecone-and-llms-like-gpt-4-and-chatgpt
    📌 Hey everyone! Enjoying these NLP tutorials? Check out my other project, AI Demos, for quick 1-2 min AI tool demos! 🤖🚀
    🔗 YouTube: www.youtube.com/@aidemos.futuresmart
    We aim to educate and inform you about AI's incredible possibilities. Don't miss our AI Demos YouTube channel and website for amazing demos!
    🌐 AI Demos Website: www.aidemos.com/
    Subscribe to AI Demos and explore the future of AI with us!

  • @LaveshNK
    @LaveshNK ปีที่แล้ว +11

    Just an update: unstructured 0.6.1 does not include local inference by default. Along with the import statements you also have to run *!pip install unstructured[local-inference]* to load your documents.
    Thanks for the content!

  • @Yogic-ignition
    @Yogic-ignition 11 หลายเดือนก่อน +3

    For everyone getting an error on:
    embeddings = OpenAIEmbeddings(model_name="ada")
    text = "Hello world"
    query_result = embeddings.embed_query(text)
    len(query_result)
    you can go ahead and use:
    import openai
    response = openai.Embedding.create(
        input="Hello world",
        model="text-embedding-ada-002"
    )
    embeddings = response['data'][0]['embedding']
    len(embeddings)

    • @vedantpandya9655
      @vedantpandya9655 5 หลายเดือนก่อน

      its still not working bro
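      If the snippet above still fails, note that both the openai and LangChain packages have since changed their interfaces. A minimal sketch, assuming openai >= 1.0 and the langchain-openai package are installed (package choice and model name are assumptions, not from the video):

      # new-style OpenAI client (openai >= 1.0)
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment
      response = client.embeddings.create(input="Hello world", model="text-embedding-ada-002")
      print(len(response.data[0].embedding))  # 1536 dimensions for ada-002

      # roughly equivalent LangChain call
      from langchain_openai import OpenAIEmbeddings

      embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
      print(len(embeddings.embed_query("Hello world")))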

  • @FindMultiBagger
    @FindMultiBagger ปีที่แล้ว +5

    Crisp and to the point!!! Great work.
    Need more tutorials on LLMs.

  • @kevon217
    @kevon217 ปีที่แล้ว +3

    This was a very illuminating demo. Appreciate it!

  • @SaiKiranAdusumilli
    @SaiKiranAdusumilli ปีที่แล้ว +4

    Fastest research on complete langchain and you provided the best notes ❤🎉

  • @extrememike
    @extrememike ปีที่แล้ว +2

    Good explanations. The flow diagrams really help with the big picture. Thanks!

  • @port7421
    @port7421 ปีที่แล้ว +2

    It was a very helpful presentation. Thanks and greetings from Poland.

  • @rotormeeeeeeee
    @rotormeeeeeeee ปีที่แล้ว +2

    This is first-class information.
    Thank you.
    I just subscribed!

  • @dogtens1060
    @dogtens1060 11 หลายเดือนก่อน +1

    nice tutorial, thank you Pradip!

  • @thomasguillemard4873
    @thomasguillemard4873 ปีที่แล้ว +2

    Clear explanations and great code! Thanks!

  • @100p
    @100p ปีที่แล้ว +2

    Terrific work! Thank you :)

  • @entertainmentbuzz4934
    @entertainmentbuzz4934 หลายเดือนก่อน

    Very useful information in the video.
    Thanks.!

  • @rishikapandit6780
    @rishikapandit6780 ปีที่แล้ว +3

    Sir, in the code it is giving an error saying that pinecone does not have an attribute named 'init'. How do we resolve this?
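    This error typically means pinecone-client 3.x or newer is installed, where pinecone.init() was removed. A minimal sketch of the newer client, assuming pinecone-client >= 3.0 (the index name is a placeholder, not from the video):

    from pinecone import Pinecone

    pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
    index = pc.Index("langchain-demo")          # attach to an existing index
    print(index.describe_index_stats())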

  • @mohsinaliriad5278
    @mohsinaliriad5278 ปีที่แล้ว +1

    Looking for something similar and found the best one. Thanks @Pradip

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      Glad you liked it! Check other videos also.

  • @ShadowD2C
    @ShadowD2C 4 หลายเดือนก่อน

    Would've been helpful to have the docs included with the Colab files to run some tests straight away.

  • @davidtowers7851
    @davidtowers7851 ปีที่แล้ว +5

    First rate notes.

  • @rayen1722
    @rayen1722 ปีที่แล้ว +1

    Very useful video! Thank you :)

  • @quengelbeard
    @quengelbeard 6 หลายเดือนก่อน +1

    Hey Pradip, great video!
    Do you know if it's possible to automatically create a pinecone db index from code?
    So that you don't have to create them manually
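    A minimal sketch of creating an index programmatically, assuming pinecone-client >= 3.0 (index name, cloud, and region are placeholders):

    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
    if "langchain-demo" not in pc.list_indexes().names():
        pc.create_index(
            name="langchain-demo",
            dimension=1536,      # must match the embedding model (ada-002 -> 1536)
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-east-1"),
        )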

  • @ujjwalsrivastava6248
    @ujjwalsrivastava6248 2 หลายเดือนก่อน

    Sir, is it required to have an OpenAI key or call the OpenAI library? I want Q&A using the provided document only.

  • @zahramovahedinia1896
    @zahramovahedinia1896 ปีที่แล้ว +1

    This was awesome!!

  • @patrickhilpold7032
    @patrickhilpold7032 ปีที่แล้ว +3

    Has someone turned this into a Streamlit app and would be willing to share the GitHub repo? Would really appreciate that!

    • @SageLewis
      @SageLewis ปีที่แล้ว

      I agree. I would LOVE to see how to turn this into a Streamlit app.

  • @mohanvishe2889
    @mohanvishe2889 5 หลายเดือนก่อน

    Easy to understand tutorial 👍

  • @saadkhattak7258
    @saadkhattak7258 ปีที่แล้ว +3

    Hi Pradip, Hope you are doing well :)
    I installed all the dependencies and was running the following cell,
    directory = '/content/data'
    def load_docs(directory):
        loader = DirectoryLoader(directory)
        documents = loader.load()
        return documents
    documents = load_docs(directory)
    len(documents)
    I got this error: ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.9/dist-packages/PIL/_util.py)
    Any idea how to resolve this? I have faced this error before as well.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +2

      Yes, I had faced this. If you restart runtime, it will resolve.
      github.com/obss/sahi/discussions/781

    • @saadkhattak7258
      @saadkhattak7258 ปีที่แล้ว +2

      @@FutureSmartAI Okay thanks for sharing :)
      Let me try
      UPDATE:
      It worked :)

    • @sauradeepdebnath437
      @sauradeepdebnath437 ปีที่แล้ว

      @FutureSmartAI Thanks, it worked. I had to change the version of PIL to 6.2.2 and then restart the runtime.

  • @shahabuddin-pc8jr
    @shahabuddin-pc8jr 6 หลายเดือนก่อน +1

    great work i love this ❤❤

    • @FutureSmartAI
      @FutureSmartAI  6 หลายเดือนก่อน

      Glad you like it!

  • @Rider-jn6zh
    @Rider-jn6zh 6 หลายเดือนก่อน +1

    Hello brother,
    Can you please upload videos on how to evaluate LLM models and which evaluation metrics can be used for specific use cases?
    I am getting this question in every interview and am not able to answer it.

  • @khari_baat
    @khari_baat ปีที่แล้ว +1

    Thank you Dear.

  • @saswatmishra1256
    @saswatmishra1256 ปีที่แล้ว +1

    Can you make a video on how to make the chatbot using open source embedding like instruct and any open source llm. Btw great video ❤

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      I have a video with open-source embeddings and an open-source vector DB, but not an open-source LLM. Will do.

  • @borisguzmancaceres9105
    @borisguzmancaceres9105 10 หลายเดือนก่อน +1

    Love your videos. I have to do something like that at my job; can you help me?
    I have to build a chatbot trained on multiple documents, using templates, Streamlit, OpenAI, etc.

    • @FutureSmartAI
      @FutureSmartAI  10 หลายเดือนก่อน

      Yes, you can easily do this with multiple docs. Did you check the new Assistants API? It has made things simpler now. th-cam.com/video/yo0qy7xyd3A/w-d-xo.htmlsi=x9WoNOifwwj2Yz6k

  • @akarshghale2932
    @akarshghale2932 ปีที่แล้ว +1

    Hello!
    Can you please share whether it is possible to mention the namespace in langchain when using Pinecone loader?
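    For what it's worth, the LangChain Pinecone vector store does accept a namespace argument; a minimal sketch, assuming the docs, embeddings, index_name, and query objects from the video (the namespace value is a placeholder):

    docsearch = Pinecone.from_documents(docs, embeddings, index_name=index_name, namespace="customer-a")
    matches = docsearch.similarity_search(query, k=4, namespace="customer-a")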

  • @hitendrasingh01
    @hitendrasingh01 ปีที่แล้ว +1

    best content thankyou.

  • @FindMultiBagger
    @FindMultiBagger ปีที่แล้ว +1

    Subscribed !!! ♥️

  • @rahuldinesh2840
    @rahuldinesh2840 6 หลายเดือนก่อน

    I think database like MySQL is better than vector DB.

  • @SeBa-mg3ms
    @SeBa-mg3ms ปีที่แล้ว +1

    Hi, great tutorial! Is it possible for ChatGPT to answer with an image that is in the PDF? Like if you ask it about some document where there is a description of a tiger and a tiger image, can it answer with a summary about the tiger and an image?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      No, at this moment it only understands text.

  • @snehitvaddi
    @snehitvaddi 6 หลายเดือนก่อน

    Hello! I’m working on creating an idiom dataset to fine-tune LLaMa2 for suggesting idioms based on different scenarios. I have a PDF full of idioms and I’m wondering if there’s a way to extract all the idioms using GPT or any other Large Language Model. Is there a cost-effective or free method to generate this dataset? Also, could you advise on how the data should be structured for fine-tuning the LLM? Should it be similar to a QnA format or something else?

  • @ljfi3324
    @ljfi3324 ปีที่แล้ว +1

    Hello, greetings from México! 🇲🇽
    I have a question: when the credits for GPT run out, what other AI model do you recommend using?
    Another amazing video!

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      There are other open-source options available, but they are not that good. You should check and see whether they work for your use case. There are a few alternatives:
      Alpaca
      Vicuna
      GPT4ALL
      Flan-UL2

  • @shinycaroline3722
    @shinycaroline3722 6 หลายเดือนก่อน

    Nice tutorial and well explained 👍 But suppose I want to create the embeddings once and for all and then access them through the index when required; how could that be done? I came across the function Pinecone.from_existing_index(), but when I tried it, it didn't work out. Not sure whether the issue is at the LangChain end. Because creating the embeddings on each run is not the correct approach, right?
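    The usual pattern is to embed once and then re-attach to the same index on later runs; a minimal sketch, assuming the docs, embeddings, index_name, and query objects from the video:

    # first run only: embed the documents and build the index
    docsearch = Pinecone.from_documents(docs, embeddings, index_name=index_name)

    # later runs: reuse the stored vectors without re-embedding the documents
    docsearch = Pinecone.from_existing_index(index_name, embeddings)
    similar_docs = docsearch.similarity_search(query, k=4)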

  • @nikk6489
    @nikk6489 ปีที่แล้ว +1

    Nice explanation and video. A few questions: what will be the overhead cost of using the OpenAI API key and even Pinecone? Can you give some idea or thoughts on how we can create the QA system without using the OpenAI API key? Many thanks in advance. Cheers!!

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      Then we have to use open source alternatives. I am exploring some open source llms

  • @nabinbhusalofficial
    @nabinbhusalofficial 3 หลายเดือนก่อน

    Will it work for low resource language like Nepali? What should be taken care of in that case?

    • @FutureSmartAI
      @FutureSmartAI  3 หลายเดือนก่อน

      You should check it; otherwise you can find an open-source LLM that is fine-tuned on more Nepali data.
      Here is the list of languages it supports, but I couldn't find Nepali there: help.openai.com/en/articles/8357869-how-to-change-your-language-setting-in-chatgpt

  • @shahabuddin-pc8jr
    @shahabuddin-pc8jr 6 หลายเดือนก่อน

    But how do we connect these things with a mobile framework like Flutter, from where we upload a document and also pass queries like a conversation?

    • @FutureSmartAI
      @FutureSmartAI  6 หลายเดือนก่อน

      You will need to create an API that your Flutter app will use.
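      A minimal sketch of such an API with FastAPI, assuming the get_answer() helper from the video (endpoint name and request schema are made up for illustration):

      from fastapi import FastAPI
      from pydantic import BaseModel

      app = FastAPI()

      class Question(BaseModel):
          question: str

      @app.post("/ask")
      def ask(q: Question):
          # get_answer() is the document-QA function built in the video
          return {"answer": get_answer(q.question)}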

  • @HiteshGulati
    @HiteshGulati ปีที่แล้ว +2

    Hi, thanks for the detailed video. I was able to follow your video and create a QnA chatbot. One place I am stuck is how I can reuse the embeddings created earlier; is there a way to fetch already saved embeddings from the Pinecone DB into the docsearch variable? Any suggestion would be helpful :)

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      Yes, we can reuse embeddings; we have to use the same index name. Did you check my other Pinecone videos? th-cam.com/video/4QaodVdUTf0/w-d-xo.html

    • @HiteshGulati
      @HiteshGulati ปีที่แล้ว

      @@FutureSmartAI Thanks I'll check this video.

  • @polly28-9
    @polly28-9 4 หลายเดือนก่อน

    Thanks for the video! Well done! I want to know how to make the chatbot return a list of results: not only one result, but a list of relevant answers to the input question. I do not know what to change: search index, search parameters, metric_type, or what? Can you help me? Thanks!

    • @FutureSmartAI
      @FutureSmartAI  3 หลายเดือนก่อน

      Hi, can you share a scenario that requires multiple answers for a single question?

    • @polly28-9
      @polly28-9 2 หลายเดือนก่อน

      @FutureSmartAI Hi, we have a database of various customer inquiries, and many of these inquiries are about the same issues: one topic appears in many inquiries from different customers. When I ask the chatbot about something, I want it to return all the inquiries from the various customers on this topic. In other words, the chatbot must return a list of all inquiries about the same topic. How can I do this? Change the search params? Right now I use this to search:
      vector_store = get_vector_store()
      retriever = vector_store.as_retriever(
          # search_type="mmr",
          search_type="similarity",
          search_kwargs={'k': 6, 'lambda_mult': 0.25}
      )
      but I am not sure. Can you help me with how to do that? Thanks!
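      One way to return a list of matching inquiries rather than a single generated answer is to query the vector store directly; a minimal sketch, assuming vector_store above is a LangChain vector store (the query text and k value are placeholders):

      matches = vector_store.similarity_search_with_score("topic of the inquiry", k=20)
      for doc, score in matches:
          print(score, doc.metadata.get("source"), doc.page_content[:100])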

  • @dianaliu7543
    @dianaliu7543 7 หลายเดือนก่อน

    This is great. How to deploy this question-answering on AWS?

    • @FutureSmartAI
      @FutureSmartAI  7 หลายเดือนก่อน

      Check my channel; I have two videos on this:
      How to deploy Streamlit on AWS EC2
      Deploying ChatGPT + FastAPI on EC2

  • @satheeshthangaraj5614
    @satheeshthangaraj5614 ปีที่แล้ว +1

    Hi Pradip, thanks for sharing. If we want to deploy this code on AWS as a web app, what changes should we make to this code?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      You can integrate this code into the streamlit app and deploy it on ec2.
      check this: th-cam.com/video/W7kDwsWFjvE/w-d-xo.html
      and this: th-cam.com/video/904cW9lJ7LQ/w-d-xo.html

    • @satheeshthangaraj5614
      @satheeshthangaraj5614 ปีที่แล้ว

      Thank You

    • @satheeshthangaraj5614
      @satheeshthangaraj5614 ปีที่แล้ว

      Can we use Django framework to build this ML app?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      @@satheeshthangaraj5614 yes

  • @sauradeepdebnath437
    @sauradeepdebnath437 ปีที่แล้ว

    Great video! One question though: for the vector store, why did we use the "ada" model instead of, say, GPT-3.5, which we are using downstream as the LLM anyway?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      We need to use an embedding model and not a completion model; GPT-3.5 is a completion/chat model and does not produce the document embeddings we store in the vector DB.

  • @ambrosionguema9200
    @ambrosionguema9200 ปีที่แล้ว

    Hi Pradip, excellent. But I can't extract the source metadata; I don't understand.

  • @axysharma2010
    @axysharma2010 11 หลายเดือนก่อน

    How do I show the document path along with the get_answer(query) call, without printing similar_docs?
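    A minimal sketch of returning sources alongside the answer, assuming the chain object from the video and an illustrative get_similar_docs() helper that wraps docsearch.similarity_search (the "source" metadata key is what most LangChain loaders set):

    def get_answer_with_sources(query):
        similar_docs = get_similar_docs(query)
        answer = chain.run(input_documents=similar_docs, question=query)
        sources = [doc.metadata.get("source") for doc in similar_docs]
        return answer, sources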

  • @Ds12781
    @Ds12781 ปีที่แล้ว

    Thank you for sharing. You are using similarity search to retrieve relevant chunks, but will this provide all relevant documents? Some relevant documents may be missed, and this can lead to inaccurate answers. There may be a use case where accuracy is needed without losing any information.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      You can bring in more chunks; there are different modes available for generating the response.
      E.g. we can pass all chunks to GPT and generate the answer, or pass one chunk at a time and then generate and refine the answer.
      Here is good documentation: python.langchain.com/docs/modules/chains/document/
      I also talked about this in my recent video: th-cam.com/video/5NG8mefEsCU/w-d-xo.html
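      The chain type is just a parameter when building the QA chain; a minimal sketch, assuming the llm, similar_docs, and query objects from the video:

      from langchain.chains.question_answering import load_qa_chain

      # "stuff" puts all retrieved chunks into one prompt;
      # "map_reduce" and "refine" process chunks one at a time and combine the results
      chain = load_qa_chain(llm, chain_type="map_reduce")
      answer = chain.run(input_documents=similar_docs, question=query)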

  • @rinkugangishetty5713
    @rinkugangishetty5713 7 หลายเดือนก่อน

    I have content of nearly 100 pages. Each page has nearly 4,000 characters. What chunk size can I choose and what retrieval method can I use for optimised answers?

    • @FutureSmartAI
      @FutureSmartAI  7 หลายเดือนก่อน

      It depends on what embedding you are using. For example, many sentence-transformer embeddings only support up to 512 tokens and will ignore all text after that.
      Retrieval method: you should experiment, but start with the default.
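      A minimal sketch of the chunking step this refers to, assuming the documents list loaded earlier with DirectoryLoader (the splitter choice and sizes are just starting points to experiment with, not values from the video):

      from langchain.text_splitter import RecursiveCharacterTextSplitter

      splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
      docs = splitter.split_documents(documents)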

    • @rinkugangishetty5713
      @rinkugangishetty5713 7 หลายเดือนก่อน

      @FutureSmartAI I'm using the "Ada" embeddings. I'm using MultiQueryRetriever.

  • @aislayer2866
    @aislayer2866 ปีที่แล้ว +2

    if you get this error when loading the data: "ImportError: cannot import name 'is_directory'"
    use pillow==9.5

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      I have improved the audio. Check my recent videos.

  • @kamalthej9794
    @kamalthej9794 10 หลายเดือนก่อน

    Hi, I am unable to resolve this error. Can you please help me with this?
    !pip install pinecone
    ERROR: Could not find a version that satisfies the requirement pinecone (from versions: none)
    ERROR: No matching distribution found for pinecone

    • @FutureSmartAI
      @FutureSmartAI  10 หลายเดือนก่อน

      pip install pinecone-client

  • @MohitKumar-gp6nr
    @MohitKumar-gp6nr ปีที่แล้ว

    I have some JSON files which I want to use as the chatbot data source. How do I store the JSON information in Chroma DB using embeddings and then retrieve it based on the user query? I googled a lot but did not find any answers.
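    One way is to wrap each JSON record in a LangChain Document and index those into Chroma; a minimal sketch, assuming the embeddings object from the video (the file name and field names are placeholders):

    import json
    from langchain.schema import Document
    from langchain.vectorstores import Chroma

    records = json.load(open("data.json"))
    docs = [Document(page_content=r["text"], metadata={"id": r.get("id")}) for r in records]
    db = Chroma.from_documents(docs, embeddings, persist_directory="./chroma_db")
    results = db.similarity_search("user query", k=4)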

  • @VenkatesanVenkat-fd4hg
    @VenkatesanVenkat-fd4hg ปีที่แล้ว +1

    Thanks for the video. How do we get the page number as well?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      I haven't tried it yet with LangChain. Did you try processing a multi-page PDF?
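      If the documents are PDFs, one option is PyPDFLoader, which records the page number in each chunk's metadata; a minimal sketch (the file name is a placeholder):

      from langchain.document_loaders import PyPDFLoader

      pages = PyPDFLoader("report.pdf").load()
      print(pages[0].metadata)   # e.g. {'source': 'report.pdf', 'page': 0}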

  • @sunil_modi1
    @sunil_modi1 ปีที่แล้ว

    Very useful video for document question answering, but when I use session state for storing the chat conversation, it fails to give the correct answer. Would you please provide some lecture or article to refer to?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      Did you check my recent video on LangChain and Streamlit chat? There I have shown how to refine the query to get the correct answer.

  • @mattiabolognesi1787
    @mattiabolognesi1787 ปีที่แล้ว

    This is amazing man! Great content
    I would love to know how to expand this program's knowledge to ChatGPT's general knowledge when it does not find the relevant information in the embeddings. Any suggestions?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      You can focus on having better embeddings, and if you don't find the answer via semantic search, you can ask GPT to answer from its own knowledge.

  • @axysharma2010
    @axysharma2010 11 หลายเดือนก่อน

    I am getting the error AttributeError: 'tuple' object has no attribute 'page_content' when I run get_answer(query).
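    A common cause (an assumption here, not confirmed from the code) is that similar_docs holds (Document, score) tuples from similarity_search_with_score, while the QA chain expects plain Documents; a minimal sketch of unpacking them first, assuming the docsearch and chain objects from the video:

    pairs = docsearch.similarity_search_with_score(query, k=4)
    similar_docs = [doc for doc, score in pairs]
    answer = chain.run(input_documents=similar_docs, question=query)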

  • @sahil0094
    @sahil0094 ปีที่แล้ว

    Is this fine-tuning of the LLM, or what? I understand we are using embeddings to create vectors of the documents and using those as context. So what would the context for the LLM be here? All document vectors, or just 4096 tokens in the case of GPT-3?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      No, it's not fine-tuning. Here we are not asking the model to remember; rather, we are providing the model knowledge at query time.
      It has many advantages:
      You can keep adding more data to the knowledge base and you don't need to worry about fine-tuning again.
      GPT-3.5 and 4 are not available for fine-tuning yet.

  • @vishalsugandh
    @vishalsugandh 7 หลายเดือนก่อน

    Is it possible to use this for millions of documents?

  • @Lakshita-z5i
    @Lakshita-z5i 6 หลายเดือนก่อน

    The embedding query part is not working. I have tried other solutions but even those give errors. 14:52

    • @FutureSmartAI
      @FutureSmartAI  6 หลายเดือนก่อน

      Library has changed significantly. I am creating new video

  • @pradeept328
    @pradeept328 ปีที่แล้ว

    Thanks for uploading. But if I ask any question that is not related to the indexed documents, it still generates an answer from its own world knowledge. How do I prevent that?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      You can restrict it.
      You can add "Don't use general knowledge from outside the context." to the prompt.

  • @larawehbee
    @larawehbee ปีที่แล้ว

    Interesting!! Thanks for sharing!! Can we use LangChain with Haystack? And Elasticsearch instead of Pinecone?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      Yes, LangChain supports Elasticsearch also. For Haystack you would need to check.

  • @nezubn
    @nezubn ปีที่แล้ว

    I have tabular data and the columns have Questions, Answers and User Queries. Now if a new user comes and asks a new question, I would like to match it with the existing set of questions --> answers.
    As a query can be asked in many different ways the context recognition will be really important. Can I pass it as csv as you passed a pdf in this current setup?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      You can treat each row of the CSV as one chunk and insert it into Pinecone. When users ask any question, you can retrieve similar rows.
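      A minimal sketch of that row-per-chunk approach with LangChain's CSVLoader, assuming the embeddings and index_name from the video (the file name is a placeholder):

      from langchain.document_loaders import CSVLoader

      docs = CSVLoader("faq.csv").load()           # one Document per CSV row
      docsearch = Pinecone.from_documents(docs, embeddings, index_name=index_name)
      matches = docsearch.similarity_search("new user question", k=3)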

  • @Gautamkumar-tk1xt
    @Gautamkumar-tk1xt ปีที่แล้ว

    Why is this error popping up on my Windows machine? 'apt-get' is not recognized as an internal or external command,
    operable program or batch file.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      Hi, we don't use apt-get install on Windows. See if this helps: towardsdatascience.com/poppler-on-windows-179af0e50150

  • @rohith646
    @rohith646 ปีที่แล้ว

    Hey Pradip, can we download it as a model and build a UI for asking questions, so that it looks like a chatbot?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      you can use this th-cam.com/video/nAKhxQ3hcMA/w-d-xo.html

  • @omkarmalpure3463
    @omkarmalpure3463 ปีที่แล้ว

    Which types of documents does LangChain support? Like Excel, or PDFs, etc.?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      It supports many types: python.langchain.com/docs/modules/data_connection/document_loaders/

  • @larawehbee
    @larawehbee ปีที่แล้ว

    Thanks for the informative video. I would really appreciate your response to my questions. The text splitter splits the document into chunks, so let's say I have a PDF of 7 pages; will the 7 pages be saved as different split chunks in the vector DB? And another question: the answer will be from a specific chunk, right, and not the whole document?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      Yes, those 7 pages will be split into different chunks based on the chunk size, and we will be matching the user query against those chunks.

    • @larawehbee
      @larawehbee ปีที่แล้ว

      @FutureSmartAI Amazing, thanks. I'm using Sentence Transformers embeddings to avoid using the OpenAI API, because I need an on-premises solution, but the results come out quite weak. Do you recommend a better transformer? Do you think LlamaEmbeddings might be a better fit and close to the OpenAI embeddings?

  • @aanchalgupta2577
    @aanchalgupta2577 ปีที่แล้ว

    Hi Pradip, first of all, nice video. Can you please let me know: I have data where 50-60 percent is labeled and the rest is unlabeled, so is it possible to use this mechanism on that? I want to get the labels for the unlabeled data using similarity search.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      You mean you want to take an unlabeled example and use semantic search to find its label? If yes, then you can first index all the examples for which you have labels. While inserting into Pinecone you can add metadata to each vector. When you find matching examples from Pinecone you will also get that metadata; check my other Pinecone videos.
      So if you take a test example t and find the top 5 matching examples from Pinecone, then get the metadata of those 5 examples, take a majority vote of those 5 labels, and assign that label to test example t.
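      A minimal sketch of that majority-vote idea, assuming the docsearch index from the video and that each labeled example was indexed with metadata={"label": ...} (the key name is an assumption):

      from collections import Counter

      matches = docsearch.similarity_search_with_score(test_example, k=5)
      labels = [doc.metadata["label"] for doc, _ in matches]
      predicted_label = Counter(labels).most_common(1)[0][0]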

    • @aanchalgupta2577
      @aanchalgupta2577 ปีที่แล้ว

      @@FutureSmartAI Thanks a lot.

    • @aanchalgupta2577
      @aanchalgupta2577 ปีที่แล้ว

      I am able to see only one video, this one based on Pinecone. Can you please share the link to the Pinecone playlist?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +1

      ​@@aanchalgupta2577 th-cam.com/video/bWOvO_cxLHw/w-d-xo.html
      th-cam.com/video/4QaodVdUTf0/w-d-xo.html

  • @harinisri2962
    @harinisri2962 ปีที่แล้ว

    I tried ConversationBufferWindowMemory; my model is generating answers for out-of-context questions. How can I restrict that?

  • @BREWRBlitzHotline-el1cj
    @BREWRBlitzHotline-el1cj ปีที่แล้ว

    Can we add scraping your own web pages to it easily?

  • @gunngunn6763
    @gunngunn6763 ปีที่แล้ว

    Hi... how can one give access to other people, just like ChatGPT, without running the code every time?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      You can integrate this with Streamlit and deploy it. Check my other videos.

  • @SaveThatMoney411
    @SaveThatMoney411 ปีที่แล้ว

    Trying to figure out how to use this to write my academic review papers and research articles.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      While writing if you need factual info from specific articles or sources you can use this.

  • @Venkatesh-vm4ll
    @Venkatesh-vm4ll ปีที่แล้ว

    What if we create an API where, every time a new file is uploaded, we store the embeddings in the same index? Is this right, or is there a different approach? Do we need to create a new index for every file upload?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      You can insert into the same index.

  • @basavaakash8846
    @basavaakash8846 ปีที่แล้ว

    How do we calculate the accuracy of the model?

  • @manikantasurapathi92
    @manikantasurapathi92 ปีที่แล้ว

    Hey @pradip, I'm using a Windows laptop and running my code in VS Code. I'm not able to use apt-get install poppler-utils. Can you please help?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      Hi, we don't use apt-get install on Windows. See if this helps: towardsdatascience.com/poppler-on-windows-179af0e50150

  • @shrutinathavani
    @shrutinathavani ปีที่แล้ว +1

    Does this answer based on semantic search?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      Yes

    • @shrutinathavani
      @shrutinathavani ปีที่แล้ว

      @FutureSmartAI I still get ImportError: cannot import name 'is_directory' from 'PIL._util', even though I have upgraded the packages of the libraries used.

  • @PedramAbrari
    @PedramAbrari ปีที่แล้ว

    When I run the question_answering function and I use chain_type of map_reduce, I get the following error: ValueError: OpenAIChat currently only supports single prompt.
    If I use chain_type of stuff, I get the following error: InvalidRequestError: The model: `gpt-4` does not exist
    What am I doing wrong here?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      "InvalidRequestError: The model: `gpt-4` does not exist"
      This means you don't have access to the GPT-4 model yet.
      chain_type of map_reduce? It will make multiple requests.
      You can read more here about stuff vs map_reduce:
      docs.langchain.com/docs/components/chains/index_related_chains

  • @sahil0094
    @sahil0094 ปีที่แล้ว

    How to measure accuracy of output of LLMs?

  • @learnforjannah7763
    @learnforjannah7763 ปีที่แล้ว

    directory = '/content/data'
    def load_docs(directory):
        loader = DirectoryLoader(directory)
        documents = loader.load()
        return documents
    documents = load_docs(directory)
    len(documents)
    This code does not work. Please help me solve this problem.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      What is the error?

    • @rekha388
      @rekha388 ปีที่แล้ว

      @FutureSmartAI WARNING:langchain.embeddings.openai:Retrying langchain.embeddings.openai.embed_with_retry.._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details. How do I fix this?

  • @shrutinathavani
    @shrutinathavani ปีที่แล้ว

    Hello sir, I am getting a validation error...
    Could you please check?
    ValidationError: 1 validation error for OpenAIEmbeddings
    model_name
    extra fields not permitted (type=value_error.extra)

    • @shrutinathavani
      @shrutinathavani ปีที่แล้ว

      embeddings = OpenAIEmbeddings(model_name="ada")
      query_result = embeddings.embed_query("Hello world")
      len(query_result)
      for this cell!!

    • @xXswagXxbro
      @xXswagXxbro ปีที่แล้ว

      @shrutinathavani Remove the argument from the OpenAIEmbeddings function:
      embeddings = OpenAIEmbeddings()

    • @jessicajames8724
      @jessicajames8724 ปีที่แล้ว

      Did you resolve this? Even I'm facing the same error.

    • @jessicajames8724
      @jessicajames8724 ปีที่แล้ว

      @FutureSmartAI

    • @xapreditz07
      @xapreditz07 6 หลายเดือนก่อน

      @shrutinathavani Do this:
      embeddings = OpenAIEmbeddings()
      query_result = embeddings.embed_query("Hello world")
      len(query_result)

  • @mdsohailahmed7936
    @mdsohailahmed7936 ปีที่แล้ว

    Sir, how do I use Pinecone with JSON data?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      You should be able to calculate embeddings for it.

  • @Anonymus_123
    @Anonymus_123 ปีที่แล้ว

    How would I restrict the answers to the provided data? Like if I type the query "What is Rasa Chatbot?", since I don't have it in my data, I expect it to give an answer like "I don't know". But I am unable to do so; I am getting an answer which I should not. I would be glad if anyone could solve my problem. Also, I am getting a "tesseract is not installed" error whenever I try to load the documents.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      It restricts answers to only the documents and says "I don't know" if the answer is not present in the document.

    • @Anonymus_123
      @Anonymus_123 ปีที่แล้ว

      @FutureSmartAI Would you please give me the code for that, on how to restrict it apart from what it has been given?

    • @harinisri2962
      @harinisri2962 ปีที่แล้ว

      Hi, I have the same question too. In case you have tried this, can you please confirm whether it restricts the irrelevant queries which are not present in the document?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      @harinisri2962 The prompt will have that logic:
      github.com/hwchase17/langchain/blob/master/langchain/chains/question_answering/stuff_prompt.py
      prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
      {context}
      Question: {question}
      Helpful Answer:"""
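      A minimal sketch of wiring a custom prompt like that into the QA chain, assuming the llm object from the video and the prompt_template string above:

      from langchain.prompts import PromptTemplate
      from langchain.chains.question_answering import load_qa_chain

      QA_PROMPT = PromptTemplate(template=prompt_template, input_variables=["context", "question"])
      chain = load_qa_chain(llm, chain_type="stuff", prompt=QA_PROMPT)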

  • @rhiteshkumarsingh4401
    @rhiteshkumarsingh4401 ปีที่แล้ว +1

    do you recommend deleting the index in pinecone to avoid getting billed for it?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +2

      Yes, pinecone is very costly.

    • @rhiteshkumarsingh4401
      @rhiteshkumarsingh4401 ปีที่แล้ว +2

      @@FutureSmartAI can we use chroma db instead? does it provide the same functionality as pinecone?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว +2

      @rhiteshkumarsingh4401 Yes, we can use Chroma and others also. I use Pinecone because we can use it as an API and it is cloud-based.

    • @LaveshNK
      @LaveshNK ปีที่แล้ว +1

      @FutureSmartAI How different would it be if you used Chroma DB? Could you make a video on that?

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      @LaveshNK It will be similar; just the import and the vector DB object change, everything else remains the same. LangChain even uses Chroma as the default.

  • @islamicinterestofficial
    @islamicinterestofficial ปีที่แล้ว

    Is it something like a private GPT? Are our documents and questions not going to OpenAI servers? Please answer this.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      No, it's not private. We send the context and question to OpenAI to get the answer.

    • @islamicinterestofficial
      @islamicinterestofficial ปีที่แล้ว

      @@FutureSmartAI So, what's the benefit of using openai embeddings and directly using their API in python. Will it not be the same way?

    • @islamicinterestofficial
      @islamicinterestofficial ปีที่แล้ว

      @FutureSmartAI And can you make a tutorial on using a pretrained model for question answering with documents offline? I don't want to use any third-party API and want to build a private solution.

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      @islamicinterestofficial Not sure what you are asking. Both the OpenAI embedding and LLM models are APIs as a service that you can use via the REST API or the Python library.
      If you don't want to send your data, you can use open-source embeddings and an open-source LLM.

  • @moviespalace17
    @moviespalace17 ปีที่แล้ว

    embeddings = OpenAIEmbeddings(model_name="ada")
    query_result = embeddings.embed_query("Hello world")
    len(query_result)
    I'm facing the below error from the above block of code:
    ValidationError Traceback (most recent call last)
    in ()
    ----> 1 embeddings = OpenAIEmbeddings(model_name="ada")
    2
    3 query_result = embeddings.embed_query("Hello world")
    4 len(query_result)
    /usr/local/lib/python3.10/dist-packages/pydantic/main.cpython-310-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()
    ValidationError: 2 validation errors for OpenAIEmbeddings
    model_name
    extra fields not permitted (type=value_error.extra)
    __root__
    Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)

    • @FutureSmartAI
      @FutureSmartAI  ปีที่แล้ว

      Did you add the OpenAI key as an env variable? You can pass the key as a parameter also:
      embeddings = OpenAIEmbeddings(openai_api_key="my-api-key")
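      For example, setting the environment variable instead (the key value is a placeholder):

      import os
      os.environ["OPENAI_API_KEY"] = "sk-..."      # or pass openai_api_key=... as shown above
      embeddings = OpenAIEmbeddings()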