Summarizing and Querying Multiple Papers with LangChain

  • Published Nov 19, 2024

Comments • 51

  • @s4m3r
    @s4m3r 7 months ago +1

    Thanks for the video. When using VectorstoreIndexCreator and running .query(), what model is being used for the generation? I don’t see you specifying a model anywhere in the context of the vector store index

    • @automatalearninglab
      @automatalearninglab  7 months ago

      Hey! I just looked it up in the LangChain documentation because I also didn't know: it's OpenAIEmbeddings by default. Check out this source code to confirm: api.python.langchain.com/en/latest/_modules/langchain/indexes/vectorstore.html#VectorstoreIndexCreator:~:text=%5Bdocs%5Dclass%20VectorstoreIndexCreator,Field(default_factory%3Ddict)
      Thanks for watching! Cheers!
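
To make the answer above concrete: both models can be set explicitly rather than relying on defaults. A minimal sketch of the classic langchain API (the file name, question, and loader choice are illustrative, and an OPENAI_API_KEY is assumed in the environment):

```python
def build_and_query(pdf_path, question):
    """Build an index with an explicit embedding model and query it with an
    explicit generation model. Imports are kept local so this sketch only
    needs langchain (and a key) when actually called."""
    from langchain.chat_models import ChatOpenAI
    from langchain.document_loaders import PyPDFLoader
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.indexes import VectorstoreIndexCreator

    # The embedding model is set on the index creator ...
    creator = VectorstoreIndexCreator(embedding=OpenAIEmbeddings())
    index = creator.from_loaders([PyPDFLoader(pdf_path)])
    # ... and the generation model is passed to .query() itself.
    return index.query(question, llm=ChatOpenAI(model_name="gpt-3.5-turbo"))
```

If no llm is passed, .query() falls back to langchain's default OpenAI LLM, which is why the video never had to specify one.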

  • @birdingtherapy
    @birdingtherapy 1 year ago +1

    I like the clear step by step description of each line of code. Liked and subscribed!

  • @yasminesmida2585
    @yasminesmida2585 4 months ago +1

    Which SLM would you recommend for handling summaries in both English and French? Thank you.

    • @automatalearninglab
      @automatalearninglab  3 months ago

      Not aware of good ones for French, but I would probably fine-tune Llama 3.1 70B or 8B.

  • @transgenxinc3175
    @transgenxinc3175 1 year ago +1

    Excellent Work!!!!

  • @attilavass6935
    @attilavass6935 1 year ago +2

    Curiously waiting for a multiple webdoc crawl / scrape / search solution, like Langchain's Python / JS docs + OpenAI docs + their Github repos.

  • @dzanaga
    @dzanaga 1 year ago +1

    Nice tutorial, thanks for sharing

  • @brothersofgenration9185
    @brothersofgenration9185 8 months ago +1

    Hey man, nice tutorial. But suppose I have a PDF document with sections like an Abstract, Introduction, etc., and I want a section-wise summary. How can I implement that?

    • @automatalearninglab
      @automatalearninglab  8 months ago

      All right, I think the best way would be to just break the PDF down by sections and summarize those. There are many ways to actually do the sectioning properly. You could go the route of giving the full PDF to GPT-4 and asking for the breakdown to make it easy, but I would look into built-in tools in LangChain; by now they must have more specialized tools.

    • @brothersofgenration9185
      @brothersofgenration9185 8 months ago +1

      @@automatalearninglab Hi man, I am able to get the desired results by specifying the prompt and using gpt-3.5-turbo. I will look into the LangChain part too. Thanks for replying.

    • @automatalearninglab
      @automatalearninglab  8 months ago

      @@brothersofgenration9185 Awesome! You're welcome! :)
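
For the section-wise idea discussed above, the "break down by sections" step can be sketched with the standard library alone; the heading list is an assumption (adapt it to your papers), and summarizing each returned section is left to whatever chain you already use:

```python
import re

# Common paper headings -- an illustrative assumption, not from the video.
SECTION_PATTERN = re.compile(
    r"^(Abstract|Introduction|Methods|Results|Discussion|Conclusion)\s*$",
    re.MULTILINE | re.IGNORECASE,
)

def split_into_sections(text):
    """Return {heading: body} for each recognized section heading,
    where a section's body runs until the next heading (or the end)."""
    matches = list(SECTION_PATTERN.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        start = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1).title()] = text[start:end].strip()
    return sections
```

Each value in the returned dict can then be fed to a summarization chain independently, giving one summary per section.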

  • @pavithrak3989
    @pavithrak3989 6 months ago +1

    Hi, great work! I am trying to store the summary of each chunk in a single PDF file. Can you please help me with it?

    • @automatalearninglab
      @automatalearninglab  6 months ago

      You can use something simple like this:

      import subprocess

      def generate_pdf():
          subprocess.run(["mdpdf", "-o", "research_report.pdf", "research_report.md"])
          subprocess.run(["open", "research_report.pdf"])

      But adapt it to save your chunks instead of the research report markdown in this example!

  • @DrDanielCho-Kee
    @DrDanielCho-Kee 1 year ago +1

    Great video, thank you! How can we use our PDF paper database to help write a new scientific paper with the existing papers as references? I want to generate new text with the papers, not just summarize the info within them. Thanks again!

    • @automatalearninglab
      @automatalearninglab  1 year ago

      Ah, this is a bit trickier. Look into LangChain and prompt templates to chain together a set of prompts that create the text you're looking for.
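
A small sketch of the prompt-template route suggested above (classic langchain; the template wording and function name are invented for illustration):

```python
def build_related_work_prompt(summaries):
    """Format one writing step's prompt from paper summaries. To chain
    steps, feed each LLM output into the next template's variables.
    The langchain import is local so the sketch costs nothing to define."""
    from langchain.prompts import PromptTemplate

    template = PromptTemplate(
        input_variables=["summaries"],
        template=(
            "Using the paper summaries below as references, draft a "
            "related-work paragraph with citations:\n\n{summaries}"
        ),
    )
    return template.format(summaries=summaries)
```

The formatted string is then what you send to the LLM; a second template (e.g. for revising or expanding the paragraph) would take that LLM output as its own input variable.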

  • @vinitkulal5207
    @vinitkulal5207 4 months ago

    Can it summarize any research document?

  • @577Pradeep
    @577Pradeep 9 months ago +1

    Don't you need an OpenAI key when you import the openai lib and reference it below?

    • @automatalearninglab
      @automatalearninglab  9 months ago

      You need to have it as an environment variable, yes.
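
Concretely, the key can either be exported in the shell (export OPENAI_API_KEY=...) before launching the notebook, or set from Python before any langchain call; the placeholder value below is obviously not a real key:

```python
import os

# setdefault keeps any key already exported in the shell untouched,
# and only fills in the placeholder when nothing was set.
os.environ.setdefault("OPENAI_API_KEY", "sk-placeholder")
```

langchain's OpenAI wrappers read this variable automatically, which is why no key appears explicitly in the video's code.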

  • @VastIllumination
    @VastIllumination 1 year ago +1

    Thank you, this is exactly what I needed; so helpful. A quick question: can you show us how to use the Custom Prompt step that is commented out? This is exactly the feature I need. Which steps are required to be run before I can run Step 21 (Custom Prompts)? Thanks.
    Also, it would be amazing to show us how to save the summaries into a spreadsheet (CSV or XLS) file instead of a txt file, where it puts the Study Title (Column A), Study Date (Column B), and Custom Query (Column C). That would be monumental.

    • @automatalearninglab
      @automatalearninglab  1 year ago

      Yeah sure! I will either make another video or write the solution here! Thanks for watching! :)

    • @VastIllumination
      @VastIllumination 1 year ago +1

      @@automatalearninglab Thank you, that would be much appreciated. I was able to get the custom prompt to work, but more specifically I was looking to have the custom prompt search each of the documents one after another, like the summary function does, so I can load a folder and generate the same query across all the PDF files. Best regards.

    • @automatalearninglab
      @automatalearninglab  1 year ago +1

      @@VastIllumination nice, you got it, already put on my content calendar! 📆 Thanks for watching:)

    • @VastIllumination
      @VastIllumination 1 year ago +1

      @@automatalearninglab Ty, so appreciated

    • @automatalearninglab
      @automatalearninglab  1 year ago

      @@VastIllumination No worries! :)
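
For the spreadsheet request in this thread, the writing step itself is plain stdlib csv; the sample rows below are invented, and in practice each row would come from running your summary or custom-query chain on one PDF:

```python
import csv

# One dict per paper: Study Title (Column A), Study Date (Column B),
# Custom Query answer (Column C) -- invented sample data.
rows = [
    {"Study Title": "Paper A", "Study Date": "2021-05-01",
     "Custom Query": "Uses method X."},
    {"Study Title": "Paper B", "Study Date": "2022-01-15",
     "Custom Query": "Reports result Y."},
]

with open("summaries.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f, fieldnames=["Study Title", "Study Date", "Custom Query"])
    writer.writeheader()
    writer.writerows(rows)
```

The resulting CSV opens directly in Excel or Google Sheets, so no XLS-specific library is needed.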

  • @RedCloudServices
    @RedCloudServices 1 year ago +1

    I think LangChain has a flag which returns the source of each response?

    • @automatalearninglab
      @automatalearninglab  1 year ago +1

      There is a query-with-sources option; is that what you're looking for?
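
On the index wrapper used in the video this is exposed as query_with_sources; a hedged sketch (classic langchain, with the index assumed to be built as elsewhere in these comments):

```python
def ask_with_sources(index, question):
    """query_with_sources returns a dict with 'question', 'answer', and
    'sources' keys, so each response carries its provenance. The index
    argument is a VectorStoreIndexWrapper built beforehand."""
    result = index.query_with_sources(question)
    return result["answer"], result["sources"]
```

The sources string lists the documents (e.g. PDF paths) the answer was drawn from, which is handy when querying a whole folder of papers.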

  • @weihongyeo5579
    @weihongyeo5579 1 year ago +1

    Good work. May I know if this requires the OpenAI API?

  • @harshsinha9709
    @harshsinha9709 1 year ago +1

    How can we do the same if we want to do this with txt, pdf, pptx, etc.?

  • @massibob2004
    @massibob2004 1 year ago +1

    Excellent. Can we have the same ChatGPT UI and history with LangChain?

    • @automatalearninglab
      @automatalearninglab  1 year ago +5

      You can include memory with LangChain, but the UI is not part of LangChain. I might post something about this soon! :)
      Cheers

    • @andy1979s
      @andy1979s 1 year ago +1

      @@automatalearninglab Interested in memory and UI, looking forward to seeing the content.

    • @automatalearninglab
      @automatalearninglab  1 year ago

      @@andy1979s Awesome! Planning some content on this for the upcoming weeks! Thanks, guys :)
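
On the memory side mentioned above, a minimal local sketch using classic langchain's ConversationBufferMemory (the sample exchange is invented; no API key is needed just to store and reload turns):

```python
def demo_memory():
    """Store one user/assistant exchange and return the transcript.
    The langchain import is local so defining this costs nothing."""
    from langchain.memory import ConversationBufferMemory

    memory = ConversationBufferMemory()
    memory.save_context({"input": "Hi"}, {"output": "Hello! How can I help?"})
    # Returns the buffered history, e.g. "Human: Hi\nAI: Hello! How can I help?"
    return memory.load_memory_variables({})["history"]
```

Attaching such a memory object to a chain makes the model see prior turns; the chat UI itself would then be built with something separate like Streamlit or Gradio.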

  • @prakyathkini5540
    @prakyathkini5540 1 year ago +1

    Hi, are there free open-source alternatives to OpenAI?

    • @automatalearninglab
      @automatalearninglab  1 year ago

      Yes, of course; you can use Llama 2 or Mistral 7B.

    • @yasminesmida2585
      @yasminesmida2585 4 months ago +1

      @@automatalearninglab Can we use Llama2 with MapReduce, specifically with the load_summarize_chain function, or does MapReduce require an OpenAI model?

    • @automatalearninglab
      @automatalearninglab  4 months ago

      @@yasminesmida2585 As far as I know it does not; map_reduce just organizes how LangChain uses the model to orchestrate the summarization process. But use Llama 3! It's much better!
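
To make the point above concrete: load_summarize_chain accepts any LLM object, so an open model can be dropped in for the map_reduce summarization. A hedged sketch (the HuggingFacePipeline wrapper and model id are illustrative assumptions, not from the video):

```python
def summarize_with_open_model(docs, model_id="meta-llama/Meta-Llama-3-8B-Instruct"):
    """map_reduce summarization with a local Hugging Face model in place
    of OpenAI. Imports are local so this only needs the (heavy) model
    libraries when actually run."""
    from langchain.chains.summarize import load_summarize_chain
    from langchain.llms import HuggingFacePipeline

    llm = HuggingFacePipeline.from_model_id(
        model_id=model_id, task="text-generation")
    # chain_type="map_reduce": summarize each chunk, then combine.
    chain = load_summarize_chain(llm, chain_type="map_reduce")
    return chain.run(docs)
```

Any other langchain-compatible LLM wrapper (Ollama, llama.cpp bindings, etc.) would slot into the same place.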

  • @muskanrath7125
    @muskanrath7125 1 year ago +1

    Can the same be done with Hugging Face?

    • @automatalearninglab
      @automatalearninglab  1 year ago

      Yeah, there are tons of models there for summarization: huggingface.co/models?pipeline_tag=summarization&sort=downloads

    • @muskanrath7125
      @muskanrath7125 1 year ago +1

      @@automatalearninglab ok Thanks.

  • @marcellosaccomani1097
    @marcellosaccomani1097 1 year ago +1

    Can someone help solve passing `disallowed_special=()`?

    • @automatalearninglab
      @automatalearninglab  1 year ago

      I didn't get that issue; can you post the entire error?

    • @marcellosaccomani1097
      @marcellosaccomani1097 1 year ago

      @@automatalearninglab I get:
      Encountered text corresponding to disallowed special token ''
      This is raised when creating the index with VectorstoreIndexCreator.
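
For the error in this thread, a commonly suggested workaround is to build the index with an embeddings object that relaxes tiktoken's special-token check; a hedged sketch (classic langchain, OPENAI_API_KEY assumed, loaders built as elsewhere in these comments):

```python
def build_index_allowing_special_tokens(loaders):
    """Pass disallowed_special=() so tiktoken treats special-token text
    (e.g. a literal '<|endoftext|>' inside a scraped PDF) as plain text
    instead of raising. Imports are local so defining this needs no key."""
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.indexes import VectorstoreIndexCreator

    embeddings = OpenAIEmbeddings(disallowed_special=())
    return VectorstoreIndexCreator(embedding=embeddings).from_loaders(loaders)
```

The error occurs because some PDFs contain text that happens to match a tokenizer special token; emptying the disallowed set lets those strings be encoded normally.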

  • @S-Lingo
    @S-Lingo 3 months ago +1

    2:56