LangChain - Advanced RAG Techniques for better Retrieval Performance

  • Published on 4 Jul 2024
  • In this video I will show you multiple techniques to improve RAG applications. We will have a look at ParentDocumentRetrievers, MultiQueryRetrievers, Ensemble Retrievers, Document Compressors, Self-Querying and Time-Weighted VectorStore Retrievers.
    Github: github.com/Coding-Crashkurse/...
    Timestamps
    0:00 Introduction
    0:55 Chunksize Experiment
    5:45 ParentDocumentRetriever
    7:15 MultiQueryRetriever
    10:18 Contextual Compression
    15:35 Ensemble Retriever
    17:29 Self-Querying Retriever
    21:10 Time-weighted VectorStore Retriever

Comments • 45

  • @codingcrashcourses8533 5 months ago

    Many requested a follow-up video with an example - Two-Stage Retrieval with Cross-Encoders: th-cam.com/video/3w_D1L0F-uE/w-d-xo.html

  • @StyrmirSaevarsson 5 months ago

    Thank you so much for this tutorial! It is exactly the stuff I was looking for!

  • @wylhias 2 months ago

    Great useful content, with clear explanation. 👍

  • @santasalo86 17 days ago

    Nice work! A few new LangChain methods I was not aware of :)

  • @say.xy_ 6 months ago +1

    Already love your content ❤
    Would love to see you make Production Ready Chatbot Pt 2, along with the deployment part. Thank you for producing quality content for free.

    • @codingcrashcourses8533 6 months ago

      Thank you! I am currently working on a Udemy course which explains how to deploy a production-grade chatbot on Microsoft Azure. It's not free, but only costs a few bucks 🙂. I will release it in January. But of course I will continue to do videos on YT, which are completely free.

    • @Peter-cd9rp 5 months ago

      @@codingcrashcourses8533 very cool. where is it :D

  • @gangs0846 6 months ago

    Absolutely fantastic

  • @newcooldiscoveries5711 5 months ago

    Excellent information!! Thank you. Liked and Subscribed.

    • @codingcrashcourses8533 5 months ago +1

      Nice! I will release a follow-up video with a practical example on Monday ;-)

  • @danielbusquets3282 2 months ago

    Liked and subscribed. Spot on!

  • @sivajanumm 6 months ago

    Thanks for the great video on this topic.
    Can you also post some videos on LoRA with any LLM of your choice?

  • @syedhaideralizaidi1828 6 months ago

    Thank you so much for making this video! You create valuable content. I just have one question. I'm currently utilizing the Azure Search Service, and I'm curious if it's feasible to integrate all the retrievers. I've attempted to use LangChain with it, but my options seem limited to searching with specific parameters and filters. Unfortunately, there's not a lot of information available on how to effectively use these retrievers in conjunction with the Azure Search Service.

    • @codingcrashcourses8533 6 months ago

      I tried ACS before and also was not too happy with it. My biggest con is that ACS does not support the indexing API. I prefer Postgres/PgVector :)

  • @Chevignay 6 months ago

    Thank you so much, this is really good stuff.

    • @codingcrashcourses8533 6 months ago

      Thanks for your comment :)

    • @Chevignay 6 months ago

      @@codingcrashcourses8533 You're welcome, I just bought your course actually 🙂

  • @micbab-vg2mu 6 months ago

    Thank you for the video :). In your opinion, which retrieval method will give me the most accurate output (cost is not as important in my case)? I work in the pharma industry, where tolerance for LLM mistakes is very low.

    • @codingcrashcourses8533 6 months ago +1

      I cannot give you a blueprint for that. Just try it out and experiment. You know your data, and there are so many different ways to improve performance. If cost does not matter, the easiest way is to use GPT-4 instead of GPT-3.5. Also try chain-of-thought prompting and then use one of the techniques I showed in the notebooks. There are so many ways to improve performance :)

  • @theindianrover2007 5 months ago

    Thanks for the video. What are the x and y dimensions in the scatter plot (5:19)?

  • @quengelbeard 4 months ago +1

    Fantastic video! :D
    Quick question: Do you know how it's possible to create a local vector database that's queried via code, so the database doesn't get initialised each time the script is run?
    Would really appreciate your help!

    • @codingcrashcourses8533 4 months ago +1

      You just have to use the correct constructor for that database class. Methods like from_documents are just helper functions to make that easier. Not sure if I understood your question correctly, though.

    • @quengelbeard 4 months ago

      @@codingcrashcourses8533 Yeah, that pretty much answered my question, thanks a lot! Do you know which function I can use to create a local database that can also be passed to the SelfQueryRetriever.from_llm() constructor?
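The "load it instead of rebuilding it" idea from this thread can be sketched without any vector database at all. The snippet below is only an illustration under stated assumptions: `build_index`, `load_or_build`, the toy "embedding" and the JSON file are all hypothetical stand-ins, not LangChain APIs. In LangChain the persistence step is store-specific; FAISS, for example, provides `save_local` and `load_local`, and a store loaded that way is a regular vector store object that can be passed on to other components.

```python
import json
import tempfile
from pathlib import Path


def build_index(texts):
    """Stand-in for the expensive step (embedding every document)."""
    # Hypothetical toy "embedding": text length and vowel count.
    return [
        {"text": t, "vector": [len(t), sum(c in "aeiou" for c in t)]}
        for t in texts
    ]


def load_or_build(index_path, texts):
    """Reuse the index on disk if present; otherwise build it once and save it."""
    path = Path(index_path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8")), "loaded"
    index = build_index(texts)
    path.write_text(json.dumps(index), encoding="utf-8")
    return index, "built"


with tempfile.TemporaryDirectory() as tmp:
    index_file = Path(tmp) / "index.json"
    # First run: nothing on disk yet, so the index is built and saved.
    first_index, first_status = load_or_build(index_file, ["alpha", "beta"])
    # Second run: the saved index is reused and nothing is re-embedded.
    second_index, second_status = load_or_build(index_file, ["alpha", "beta"])
```

The point of the pattern is that the embedding step runs only when no saved index exists, so restarting the script does not re-initialise the database.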

  • @moonly3781 4 months ago

    Thank you for the amazing tutorial! I was wondering, instead of using ChatOpenAI, how can I use a Llama 2 model locally? Specifically, I couldn't find any implementation, for example for contextual compression, where you pass compressor = LLMChainExtractor.from_llm(llm) with ChatOpenAI as the llm. How can I achieve this locally with Llama 2? My use case involves private documents, so I'm looking for solutions using open-source LLMs.

    • @codingcrashcourses8533 4 months ago +1

      Sorry, I only use the OpenAI models due to my old computer. Can't really help you with that.

  • @ghazouaniahmed766 3 months ago

    Thank you. Can you cover the retrieval problem when we ask a question outside the RAG context, or send a greeting, for example?

  • @yazanrisheh5127 6 months ago

    I'm a beginner here and I've been using LangChain from your videos. Does advanced RAG mean that, instead of something like my code below with a similarity-based search type, I use the retriever types you showed in the video, while everything else (ConversationalRetrievalChain, prompt, memory, etc.) stays the same?
    retriever = knowledge_base.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.8})
    Also, which retriever would you recommend for large documents? I need to do RAG over 80 PDF documents and have been struggling with accuracy.
    Lastly, in your OpenAI embeddings, why are you using chunk_size=1 when the default is chunk_size=1000? Can you explain this part as well, please? Thank you in advance.

    • @codingcrashcourses8533 6 months ago +1

      The advanced techniques also work with memory etc., but with the high-level chains I showed it may become a little bit difficult and "hacky".
      In general I don't set any scores, but just retrieve the best documents. I also don't have an answer for setting a good threshold. In general I recommend using the get_relevant_documents method of the retriever interface for getting documents.
      I set the chunk_size to 1 due to rate limit errors I often experienced. With higher chunk sizes it just seems to make too many requests at once.
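To make the chunk_size discussion concrete, here is a minimal sketch of how a batch size changes the shape of the requests. It assumes that chunk_size on the embeddings class means "maximum number of texts per embedding request", which is not the same thing as the chunk size of a text splitter. `batch_texts` and the sample documents are hypothetical and only illustrate the batching arithmetic.

```python
def batch_texts(texts, chunk_size):
    """Split texts into consecutive batches of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [texts[i:i + chunk_size] for i in range(0, len(texts), chunk_size)]


docs = ["doc one", "doc two", "doc three", "doc four", "doc five"]

# chunk_size=1: one text per request, so five small requests.
small_batches = batch_texts(docs, chunk_size=1)

# chunk_size=1000: all five texts fit into a single, larger request.
large_batches = batch_texts(docs, chunk_size=1000)
```

Under that assumption, a small value produces many light requests and a large value produces few heavy ones, so a rate limit tied to the size of each request may be easier to stay under with chunk_size=1, at the cost of more round trips.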

  • @akshaykumarmishra2129 6 months ago

    Hi, in RetrievalQA from LangChain, we have a retriever that retrieves docs from a vector DB and provides context to the LLM. Let's say I'm using GPT-3.5, whose max tokens is 4096... How do I handle a huge context to be sent to it? Any suggestions will be appreciated.

    • @codingcrashcourses8533 6 months ago

      GPT-3.5 Turbo allows 16k tokens, I guess, and GPT-4 Turbo 128k. If you really need that large a context window, my go-to approach as of the end of 2023 would be to use models with larger context windows. There are also map-reduce methods to reduce the context, but these also make many requests before sending a final one.
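The map-reduce idea mentioned in the reply can be sketched in a few lines. This is a library-agnostic illustration, not LangChain's implementation: `split_into_chunks`, `map_reduce` and `fake_summarize` are hypothetical names, and `fake_summarize` merely stands in for a real model call so that the number of requests can be counted.

```python
def split_into_chunks(text, max_words):
    """Cut the text into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def map_reduce(text, summarize, max_words):
    """Summarize each chunk (map), then summarize the joined summaries (reduce)."""
    chunks = split_into_chunks(text, max_words)
    partial = [summarize(chunk) for chunk in chunks]  # one model call per chunk
    return summarize(" ".join(partial))               # one final model call


calls = []


def fake_summarize(text):
    """Hypothetical stand-in for an LLM call: keeps the first three words."""
    calls.append(text)
    return " ".join(text.split()[:3])


long_text = " ".join(f"word{i}" for i in range(100))
final = map_reduce(long_text, fake_summarize, max_words=25)
```

With 100 words and 25 words per chunk, this makes four map calls plus one reduce call. That is the trade-off described above: each request stays small, but the total number of requests grows with the size of the context.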

  • @karthikb.s.k.4486 6 months ago

    Nice tutorial. May I know which theme you use for Visual Studio Code, please?

    • @codingcrashcourses8533 6 months ago

      Material Theme dark :)

    • @karthikb.s.k.4486 6 months ago

      @@codingcrashcourses8533 A link to the theme, please, as I see a lot of Material themes in the marketplace extensions.

  • @saurabhjain507 6 months ago

    Nice video. Can you please create a video on evaluation of RAG? I think a lot of people would be interested in this.

    • @codingcrashcourses8533 6 months ago +1

      Thank you! That kind of video is currently not planned, since it's actually quite expensive to evaluate RAG output, and designing that experiment is PROBABLY something not many people would watch on YouTube. In addition to that, I am not really an expert on that topic. In my company our data scientists are currently working on this ^^

    • @prateek_alive 5 months ago

      @@codingcrashcourses8533 What would be the right technique for evaluating a RAG system? Could you share your thoughts here?

  • @whitedeviljr9351 5 months ago

    PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

  • @vicvicking1990 11 days ago

    Wait, what? I thought FAISS didn't support metadata filters?
    Weird that the time-weighted retriever works with it, no?

    • @codingcrashcourses8533 11 days ago +1

      I am not too familiar with each change. FAISS is also a work in progress, maybe they added it in some version :)

    • @vicvicking1990 11 days ago

      @@codingcrashcourses8533 In any case, your video is amazing and you are greatly helping me for my internship project.
      Many thanks, keep up the great work 💪👍
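On the FAISS question in this thread: one possible explanation is that time weighting does not have to rely on metadata filtering inside the vector store, because the recency bonus can be added in plain Python after the similarity search. The sketch below assumes the scoring rule given in the LangChain documentation for the time-weighted retriever, semantic similarity plus (1 - decay_rate) raised to the hours since last access. The function name, the candidate documents and their numbers are hypothetical.

```python
def time_weighted_score(similarity, hours_since_access, decay_rate):
    """Combine semantic similarity with a recency bonus that decays over time."""
    return similarity + (1.0 - decay_rate) ** hours_since_access


# Hypothetical search results with a similarity score and an age in hours.
candidates = [
    {"text": "old but very similar", "similarity": 0.9, "hours": 48.0},
    {"text": "fresh, less similar", "similarity": 0.6, "hours": 1.0},
]

# Re-rank after the vector search; the store itself only supplies similarities.
ranked = sorted(
    candidates,
    key=lambda d: time_weighted_score(d["similarity"], d["hours"], decay_rate=0.1),
    reverse=True,
)
```

With a decay rate of 0.1, the recently accessed document outranks the older one even though it is less similar, which is the behaviour the time-weighted retriever is meant to produce.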