Using Langchain and Open Source Vector DB Chroma for Semantic Search with OpenAI's LLM | Code

  • Published on 23 Sep 2024
  • Discover the power of LangChain, Chroma DB, and OpenAI's large language models (LLMs) in this step-by-step guide. Dive into semantic search using the open-source vector database Chroma DB.
    AI Demos: www.aidemos.com/
    AIDemos.com is your go-to directory for video demos of the latest AI tools. AI Demos's goal is to educate and inform about the possibilities of AI.
    Code and Explanation: blog.futuresma...
    Building a Document-based Question Answering System with LangChain, Pinecone, and LLMs like GPT-4.
    • Building a Document-ba...
    Revolutionizing Search: How to Combine Semantic Search with GPT-3 Q&A
    • Revolutionizing Search...
    GPT-4 API Tutorial: • GPT-4 API: Real-World ...
    ChatGPT API Tutorial: • Using OpenAI's ChatGPT...
    Building a GPT-4 Chatbot using ChatGPT API and Streamlit Chat
    • Building a GPT-4 Chatb...
    🚀 Top Rated Plus Data Science Freelancer with 8+ years of experience, specializing in NLP and Back-End Development. Founder of FutureSmart AI, helping clients build custom AI NLP applications using cutting-edge models and techniques. Former Lead Data Scientist at Oracle, primarily working on NLP and MLOps.
    💡 As a Freelancer on Upwork, I have earned over $100K with a 100% Job Success rate, creating custom NLP solutions using GPT-3, ChatGPT, GPT-4, and Hugging Face Transformers. Expert in building applications involving semantic search, sentence transformers, vector databases, and more.

Comments • 59

  • @amitagarwal5223
    @amitagarwal5223 3 months ago +1

    This video was very helpful. Good explanation. I appreciate the time and effort you put into it.

  • @oxytic
    @oxytic 1 year ago +3

    Dear Pradip,
    I have been following your video tutorials for the past three weeks, and I have found them very informative and comprehensive. However, I am having some difficulty understanding how to create a custom agent in LangChain using classes, class decorators, and LLMChain.
    I would be grateful if you could create a video tutorial that specifically addresses this topic. I believe that this would be a valuable resource for many people who are interested in learning how to use Langchain to create custom agents.

    • @FutureSmartAI
      @FutureSmartAI 1 year ago

      Sure. I will explore it and cover it.

  • @chinmaydeshpande5046
    @chinmaydeshpande5046 7 months ago +1

    Thanks for the great video. TBH, I had been struggling with this for 2 days, and your video helped me.

  • @allwiyn
    @allwiyn 1 year ago +1

    Thanks for the video. This was very useful. The information was clear enough for beginners too.

  • @digvijayyadav4168
    @digvijayyadav4168 1 year ago +1

    Great work Pradip

  • @ashxos
    @ashxos 9 months ago +1

    Thanks for Sharing!

  • @yazanrisheh5127
    @yazanrisheh5127 11 months ago +1

    Can you make a video where we persist the Chroma DB in the cloud, and show us how to add new files, delete files, embed these files, and ask questions with our new DB? I would really, really appreciate it! Thank you in advance.

  • @aidev8926
    @aidev8926 1 year ago +1

    Brilliant work!!!!

  • @nehat786
    @nehat786 1 year ago

    Hello sir, my name is Nehat. I am from India. I am a full-stack developer and very interested in AI. I have seen your videos and, trust me, I have learned a lot from them. You have an amazing way of teaching and making things easy to understand. I would be so grateful if you could teach me those skills and show me the new world of AI. Please let me know if there are any paid courses I can take. Thank you!

    • @FutureSmartAI
      @FutureSmartAI 1 year ago

      Hi Nehat, thanks. I don't have any paid courses. I mostly focus on my freelancing work and share what I learn here on YouTube. If you have any doubts, you can ask here.

  • @himu04
    @himu04 5 months ago

    cannot import name 'Document' from 'langchain'
    I am getting this error.
    I am also facing some other problems. Is it possible to connect with you?
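
One likely cause of this error is that `Document` is not exported from the top-level `langchain` package in the installed release. The following is a minimal sketch, assuming one of the two import paths below is available; the small dataclass at the end is a hypothetical stand-in so the snippet still runs when LangChain is not installed, and the file name `notes.txt` is made up for illustration.

```python
from dataclasses import dataclass, field

try:
    # Newer LangChain releases expose Document from the langchain_core package.
    from langchain_core.documents import Document
except ImportError:
    try:
        # Older releases exposed it from langchain.schema.
        from langchain.schema import Document
    except ImportError:
        # Hypothetical stand-in (not part of LangChain) so the sketch
        # runs without the library installed.
        @dataclass
        class Document:
            page_content: str
            metadata: dict = field(default_factory=dict)

# Build one document the same way regardless of which import succeeded.
doc = Document(page_content="Chroma stores embeddings.",
               metadata={"source": "notes.txt"})
print(doc.page_content)
```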

  • @sandedom339
    @sandedom339 1 year ago +1

    Very nicely explained!
    Can we do this without an OpenAI API key? Can we do it locally on our own computer?

    • @FutureSmartAI
      @FutureSmartAI 1 year ago

      For OpenAI you will need a key. If you only want to test semantic search and see the relevant docs, you don't need a key.
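
The retrieval step described in this reply only needs vectors and a similarity measure, so it can run locally. The sketch below illustrates that idea under a loud assumption: `embed` is a hypothetical toy that counts words, standing in for a real local embedding model such as a sentence transformer. It matches on word overlap only, not on meaning.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


docs = [
    "chroma is an open source vector database",
    "streamlit builds simple web apps in python",
    "a vector database stores embeddings for similarity search",
]
results = search("open source vector database", docs, k=2)
for text in results:
    print(text)
```

No API key is involved at any point; a key is only needed once the retrieved text is passed to a hosted LLM.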

    • @sandedom339
      @sandedom339 1 year ago

      @FutureSmartAI Thanks, Pradip. I just sent you an email; please take a look.

    • @riyadhmollik
      @riyadhmollik 1 year ago

      @FutureSmartAI Is this API key free or paid?

    • @fadyabdo124
      @fadyabdo124 1 year ago

      @riyadhmollik +1

  • @testahom4690
    @testahom4690 1 year ago +1

    Hey Pradip, good video.
    Do you have any plans for similar videos on Llama 2 and vector DBs?

  • @ruthirockstar2852
    @ruthirockstar2852 months ago

    I would like to know how to HOST this CUSTOM model in the cloud... please? Anyone?

  • @yosshi2028
    @yosshi2028 1 year ago +2

    Could you share your "Chroma DB with Langchain.ipynb" file?

    • @FutureSmartAI
      @FutureSmartAI 1 year ago +2

      blog.futuresmart.ai/using-langchain-and-open-source-vector-db-chroma-for-semantic-search-with-openais-llm

  • @rahulcn1314
    @rahulcn1314 3 months ago

    Is there any difference in the embeddings if we use OpenAI embeddings rather than LangChain's?

    • @FutureSmartAI
      @FutureSmartAI 3 months ago

      LangChain doesn't have its own embeddings. We can use OpenAI or other open-source embeddings such as sentence transformers.

  • @nukulkhadse5253
    @nukulkhadse5253 1 year ago

    Hi @Pradip, what if I need a summarised answer to my query, for example: what is the average health score of all my pumps? The health score is just a number in all my documents. Will the model get all that data, or will it exceed the context limit while fetching the data?

  • @jpdoan4531
    @jpdoan4531 1 year ago +1

    Thank you Pradip for this great tutorial! 👏 It worked for my dataset. 🎉
    Do you have alternative code to create chroma_db from:
    either: splitting a Pandas DataFrame's column
    or: splitting a PySpark SQL DataFrame's column of table content ?
    instead of: docs = split_docs(documents) and db = Chroma.from_documents(docs, embeddings)
    My dataset is too large and creating thousands of .txt files is not sustainable.

    • @FutureSmartAI
      @FutureSmartAI 1 year ago

      Why don't you use LangChain SQL agents?

  • @nikitamobile
    @nikitamobile 7 months ago

    Thanks for your video. I have tried to implement the same on my side; however, the text pieces returned by similarity_search for the query look irrelevant. I'm using the all-MiniLM-L6-v2 model for the embeddings and the following text_splitter settings: chunk_size=1024, chunk_overlap=20. What could be the reason for the poor similarity search results? P.S. The input files are not in English; they are in Uzbek. Could that also be the reason?
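
The chunk_size and chunk_overlap settings mentioned above control how long each piece is and how many characters neighbouring pieces share. Here is a minimal sketch of that idea, assuming a plain character window. LangChain's own splitters also try to break on separators such as newlines, which this sketch does not do, and `split_text` is a hypothetical helper name.

```python
def split_text(text, chunk_size=1024, chunk_overlap=20):
    """Cut text into windows of chunk_size characters that share
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Small values make the overlap easy to see.
chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

As far as its model card describes, all-MiniLM-L6-v2 was trained on English data and truncates input longer than 256 word pieces by default, so smaller chunks or a multilingual embedding model may be worth testing for Uzbek text.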

  • @rizwanat7496
    @rizwanat7496 8 months ago

    How long will it take to embed and store a large dataset of around 50 MB in Chroma?

  • @sachintiwari2794
    @sachintiwari2794 8 months ago

    Thanks for your helpful video!
    I am loading Python code.
    Could you please suggest the best vector semantic similarity search to get the most relevant top-k results?

    • @FutureSmartAI
      @FutureSmartAI 8 months ago

      You mean the best embedding model?

  • @MrunmayeeBangar
    @MrunmayeeBangar 3 months ago

    How can we do vector search on dynamic data? Can nodes update automatically?

    • @FutureSmartAI
      @FutureSmartAI 3 months ago

      As we get new data, we need to add more docs to the existing index or collection.
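
The workflow in this reply can be pictured as keeping documents under stable ids, so that new data is added, changed data overwrites its old entry, and stale data is removed. The sketch below is only an illustration: `ToyCollection` is a hypothetical dict-based stand-in that stores raw text and computes no embeddings. Chroma's collection object offers `add`, `upsert`, and `delete` methods for the same purpose.

```python
class ToyCollection:
    """Hypothetical in-memory stand-in for a vector DB collection."""

    def __init__(self):
        self._docs = {}

    def upsert(self, ids, documents):
        # Insert new ids and overwrite ids that already exist.
        for doc_id, text in zip(ids, documents):
            self._docs[doc_id] = text

    def delete(self, ids):
        # Silently ignore ids that are not present.
        for doc_id in ids:
            self._docs.pop(doc_id, None)

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def count(self):
        return len(self._docs)


collection = ToyCollection()
collection.upsert(["doc-1", "doc-2"], ["first version", "second document"])
collection.upsert(["doc-1"], ["updated version"])  # same id, new content
collection.delete(["doc-2"])
print(collection.count(), collection.get("doc-1"))
```

Updates are therefore not automatic: some job has to notice the new or changed source data and call the collection again.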

  • @virtualrendezvous2220
    @virtualrendezvous2220 1 year ago +1

    Hi, I have a JSON file with content and metadata. How do I split the content into chunks in that case?

    • @koushiksherugar8680
      @koushiksherugar8680 1 year ago

      I'm looking for the same.
      Have you got any leads?

    • @virtualrendezvous2220
      @virtualrendezvous2220 1 year ago +1

      @koushiksherugar8680 Yes, you can check the JSON document loader integration in the LangChain documentation.
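
Independently of any loader, the idea discussed in this thread can be sketched with the standard library: split each record's content and copy its metadata onto every chunk. The record shape (`content` and `metadata` keys), the sample data, and the helper name `chunk_records` are all assumptions made for illustration, and the split is a simple fixed-size cut without overlap.

```python
import json

# Hypothetical sample input; real files may use different key names.
raw = """
[
  {"content": "abcdefgh", "metadata": {"source": "a.json"}},
  {"content": "xyz", "metadata": {"source": "b.json"}}
]
"""


def chunk_records(records, chunk_size):
    """Split each record's content into fixed-size pieces and attach
    a copy of the record's metadata plus the chunk position."""
    chunks = []
    for record in records:
        text = record["content"]
        for index, start in enumerate(range(0, len(text), chunk_size)):
            metadata = dict(record["metadata"])
            metadata["chunk"] = index
            chunks.append({"text": text[start:start + chunk_size],
                           "metadata": metadata})
    return chunks


chunks = chunk_records(json.loads(raw), chunk_size=4)
for chunk in chunks:
    print(chunk)
```

Each resulting dictionary carries what a vector store needs per entry: the text to embed and the metadata to filter or cite by.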

  • @i_hope69
    @i_hope69 1 year ago

    Hey, I am facing an issue: DirectoryLoader either runs for a very long time or crashes. :( How can I solve this?

  • @suniha2803
    @suniha2803 1 year ago

    Hi Pradip,
    In the previous video
    you used Streamlit, LangChain, and Pinecone.
    I want to use Chroma DB instead of Pinecone.
    I don't know what I should do.
    Can you give me some tips?

    • @FutureSmartAI
      @FutureSmartAI 1 year ago

      You should be able to do it yourself. Try combining the code from both of these videos. You can even ask ChatGPT to do that.

  • @abhishekvij2409
    @abhishekvij2409 6 months ago

    Is it possible to dockerize this code with FastAPI?

  • @moralstorieskids3884
    @moralstorieskids3884 1 year ago +1

    Thanks

  • @indranilcool
    @indranilcool 1 year ago

    If different models were used for generating the embeddings, would the performance of semantic search be affected?

    • @FutureSmartAI
      @FutureSmartAI 1 year ago

      Yes. www.sbert.net/docs/pretrained_models.html

  • @shaikshavalivali876
    @shaikshavalivali876 1 year ago

    I have one doubt: does Azure Cognitive Services provide a vector DB service too, or not? Please help me.

    • @FutureSmartAI
      @FutureSmartAI 1 year ago

      They have something called Azure Cognitive Search, but you can always use other vector databases.
      learn.microsoft.com/en-us/semantic-kernel/memories/vector-db

  • @kanikasharma4611
    @kanikasharma4611 1 year ago

    Can you guide me on how to connect an LLM, Pinecone, and OpenAI embeddings with a Mongo database to create a chatbot?

  • @dattatreyagundumolu1593
    @dattatreyagundumolu1593 6 months ago

    Can I upload more than one PDF file when I integrate with Streamlit?

  • @zd676
    @zd676 7 months ago

    Great video! But if I get one dollar for every "you know" you said, I wouldn't need to worry about learning ChromaDB lol jk.

  • @nattyzaddy6555
    @nattyzaddy6555 1 year ago

    Does this video say how you can use a local model with the vector db?

  • @i_hope69
    @i_hope69 1 year ago

    Sir, I want to split a JSON file. How can I do it?