Thanks for the video. When using VectorstoreIndexCreator and running .query(), what model is being used for the generation? I don’t see you specifying a model anywhere in the context of the vector store index
Hey! I just looked it up in the LangChain documentation because I also didn't know. The index defaults to OpenAIEmbeddings for the embeddings, and if you don't pass a model to .query(), it falls back to the default OpenAI LLM for the generation. Check out this source code to confirm: api.python.langchain.com/en/latest/_modules/langchain/indexes/vectorstore.html#VectorstoreIndexCreator:~:text=%5Bdocs%5Dclass%20VectorstoreIndexCreator,Field(default_factory%3Ddict)
Thanks for watching! Cheers!
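If you'd rather not rely on the defaults, you can pass both models explicitly. A minimal sketch, assuming the legacy langchain VectorstoreIndexCreator API used in the video (needs an OpenAI key to actually run):

```python
def build_and_query(pdf_path, question):
    # Deferred imports so this sketch is easy to lift into a notebook.
    from langchain.chat_models import ChatOpenAI
    from langchain.document_loaders import PyPDFLoader
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.indexes import VectorstoreIndexCreator

    index = VectorstoreIndexCreator(
        embedding=OpenAIEmbeddings(),  # the embedding model (this is already the default)
    ).from_loaders([PyPDFLoader(pdf_path)])

    # Pass the LLM explicitly instead of relying on the default OpenAI model.
    return index.query(question, llm=ChatOpenAI(model="gpt-3.5-turbo"))
```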
I like the clear step by step description of each line of code. Liked and subscribed!
Ooh! Thanks a lot :)
Which SLM would you recommend for handling summaries in both English and French? Thank you.
Not aware of good ones for French, but I would probably fine-tune Llama 3.1 70B or 8B.
Excellent Work!!!!
:)
Curiously waiting for a multiple webdoc crawl / scrape / search solution, like Langchain's Python / JS docs + OpenAI docs + their Github repos.
Yessss
Nice tutorial, thanks for sharing
Thanks for watching :)
Hey man, nice tutorial. But suppose I have a PDF document with sections like an Abstract, Introduction, etc. Now I want a section-wise summary; how can I implement that?
All right, I think the best way would be to just break down the PDF by sections and summarize those. There are many ways to actually do the breaking down into sections properly. You can go the route of giving the full PDF to GPT-4 and asking for this breakdown to make it easy, but I would look into built-in tools in LangChain; by now they must have more specialized tools.
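For the "break down by sections" route, a minimal sketch that splits the extracted text on standalone headings with a regex (the heading list is an assumption; messy PDFs may still need GPT-4 or a layout-aware parser instead):

```python
import re

SECTION_HEADINGS = ["Abstract", "Introduction", "Methods", "Results", "Conclusion"]

def split_by_sections(text, headings=SECTION_HEADINGS):
    """Split plain text extracted from a PDF into a {heading: body} dict."""
    pattern = r"^({})\s*$".format("|".join(headings))
    parts = re.split(pattern, text, flags=re.MULTILINE | re.IGNORECASE)
    # re.split alternates: [preamble, heading, body, heading, body, ...]
    return {
        heading.title(): body.strip()
        for heading, body in zip(parts[1::2], parts[2::2])
    }
```

Each section body can then be fed to the summarize chain separately.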
@@automatalearninglab Hi man, I am able to get the desired results by specifying the prompt and using gpt-3.5-turbo. Will look into the LangChain part too. Thanks for replying.
@@brothersofgenration9185 Awesome! You're welcome! :)
Hi, great work! I am trying to store the summary of each chunk in a single PDF file. Can you please help me with it?
You can use something simple like this:
import subprocess

def generate_pdf():
    # Convert the markdown report to PDF with the mdpdf CLI, then open it (macOS "open").
    subprocess.run(["mdpdf", "-o", "research_report.pdf", "research_report.md"])
    subprocess.run(["open", "research_report.pdf"])
but adapt it to save your chunks instead of the research report markdown in this example!
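One hypothetical way to do that adaptation: write each chunk summary under its own heading in a markdown file, then run the same mdpdf command (the function names here are made up for illustration):

```python
import subprocess

def chunks_to_markdown(summaries):
    # One "## Chunk n" heading per summary.
    return "\n".join(f"## Chunk {i}\n\n{s}\n" for i, s in enumerate(summaries, 1))

def summaries_to_pdf(summaries, md_path="summaries.md", pdf_path="summaries.pdf"):
    with open(md_path, "w") as f:
        f.write(chunks_to_markdown(summaries))
    # Requires the mdpdf CLI (pip install mdpdf); "open" is macOS-only.
    subprocess.run(["mdpdf", "-o", pdf_path, md_path], check=True)
    subprocess.run(["open", pdf_path])
```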
Great video, thank you! How we can use our pdf paper database to help write a new scientific paper with the existing papers as references? I want to generate new text with the papers, not just simply summarize the info within them. Thanks again!
Ah, this is a bit more tricky. Look into LangChain and prompt templates to chain together a set of prompts that create the text you're looking for.
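A minimal sketch of that idea with a PromptTemplate and LLMChain (legacy langchain API; the prompt wording and function name are assumptions, and it needs a configured LLM to run):

```python
def draft_section(llm, topic, references_text):
    # Deferred imports: legacy langchain PromptTemplate / LLMChain API.
    from langchain.chains import LLMChain
    from langchain.prompts import PromptTemplate

    prompt = PromptTemplate(
        input_variables=["topic", "references"],
        template=(
            "Using only the reference excerpts below, write a related-work "
            "paragraph on {topic}. Cite each paper you draw from.\n\n"
            "References:\n{references}"
        ),
    )
    return LLMChain(llm=llm, prompt=prompt).run(topic=topic, references=references_text)
```

You can chain several of these (outline, then per-section drafts) to build up a full paper.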
Can it summarize any research document?
Don't you need an OpenAI key when you import the openai lib and reference it below?
You need to have it as an environment variable, yes.
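For example, in Python before any OpenAI-backed call (the key value here is a placeholder):

```python
import os

# The OpenAI client reads the key from the environment; set it before
# creating any OpenAI-backed LangChain objects. Never hard-code real keys
# in shared notebooks.
os.environ.setdefault("OPENAI_API_KEY", "sk-your-key-here")  # placeholder
```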
Thank you, this is exactly what I needed, so helpful. A quick question: can you show us how to use the Custom Prompt step that is commented out? This is exactly the feature I need. Which steps are required to be run before I can run Step 21 (Custom Prompts)? Thanks.
Also, it would be amazing to show us how to save the summaries into a spreadsheet (CSV or XLS) file instead of a txt file, where it puts the Study Title (Column A), Study Date (Column B), and Custom Query (Column C). That would be monumental.
Yeah sure! I will either make another video or write the solution here! Thanks for watching! :)
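In the meantime, the CSV part on its own is just the stdlib csv module; a minimal sketch with the three requested columns (you'd fill the rows from your summary loop):

```python
import csv

def save_summaries_csv(rows, path="summaries.csv"):
    """rows: list of (study_title, study_date, custom_query_answer) tuples."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Study Title", "Study Date", "Custom Query"])
        writer.writerows(rows)
```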
@@automatalearninglab Thank you, that would be much appreciated. I was able to get the custom prompt to work, but more specifically I was looking to have the custom prompt search each of the documents one after another, like the summary function does, so I can load a folder and generate the same query across all the PDF files. Best regards.
@@VastIllumination Nice, you got it; already put it on my content calendar! 📆 Thanks for watching :)
@@automatalearninglab Ty, so appreciated
@@VastIllumination No worries! :)
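A sketch of that folder-wide idea, assuming the legacy VectorstoreIndexCreator API (needs an OpenAI key to run; building one index per PDF keeps each answer tied to its document):

```python
import glob
import os

def query_all_pdfs(folder, question, llm=None):
    """Run the same custom query over every PDF in a folder, one at a time."""
    # Deferred imports: legacy langchain API.
    from langchain.document_loaders import PyPDFLoader
    from langchain.indexes import VectorstoreIndexCreator

    answers = {}
    for path in sorted(glob.glob(os.path.join(folder, "*.pdf"))):
        index = VectorstoreIndexCreator().from_loaders([PyPDFLoader(path)])
        answers[os.path.basename(path)] = index.query(question, llm=llm)
    return answers
```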
I think LangChain has a flag which returns the source of each response?
There is query with sources, is that what you're looking for?
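For reference, the index wrapper exposes query_with_sources; a minimal sketch (legacy langchain API, requires an OpenAI key to run):

```python
def query_with_sources(pdf_path, question):
    # Deferred imports: legacy langchain API.
    from langchain.document_loaders import PyPDFLoader
    from langchain.indexes import VectorstoreIndexCreator

    index = VectorstoreIndexCreator().from_loaders([PyPDFLoader(pdf_path)])
    # Returns a dict with "answer" and "sources" keys.
    return index.query_with_sources(question)
```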
Good work! May I know if this requires using the OpenAI API?
Yeah in this case it is
How can we do the same if we want to do this with txt, pdf, pptx, etc.?
Update I have done it 😂
Excellent. Can we have the same ChatGPT UI and history with LangChain?
You can include memory with LangChain, but the UI is not part of LangChain. I might post something about this soon! :)
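A minimal sketch of the memory part, assuming the legacy langchain ConversationChain API (a UI would sit on top, e.g. Streamlit or Gradio, which is outside LangChain):

```python
def chat_with_memory(llm):
    # Deferred imports: legacy langchain API.
    from langchain.chains import ConversationChain
    from langchain.memory import ConversationBufferMemory

    # ConversationBufferMemory keeps the full chat history and injects it
    # into each prompt, giving the ChatGPT-style history behavior.
    return ConversationChain(llm=llm, memory=ConversationBufferMemory())
```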
Cheers
@@automatalearninglab Interested in memory and UI, looking forward to seeing that content.
@@andy1979s awesome! Planning on some content for the upcoming weeks on this ! Thanks guys :)
Hi, are there free open-source alternatives to OpenAI?
Yes, of course; you can use Llama 2 or Mistral 7B.
@@automatalearninglab Can we use Llama2 with MapReduce, specifically with the load_summarize_chain function, or does MapReduce require an OpenAI model?
@@yasminesmida2585 As far as I know it does not; map_reduce just organizes how LangChain will use the model to orchestrate the summary process. But use Llama 3! It's much better!
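A sketch of map_reduce with a local Llama model served through Ollama (the Ollama integration and model name are assumptions; any LangChain-compatible LLM should work in its place):

```python
def summarize_with_local_model(docs):
    # Deferred imports: legacy langchain API; requires a running Ollama server.
    from langchain.chains.summarize import load_summarize_chain
    from langchain.llms import Ollama

    llm = Ollama(model="llama3")  # any locally served model name works here
    # map_reduce summarizes each doc, then combines the partial summaries.
    chain = load_summarize_chain(llm, chain_type="map_reduce")
    return chain.run(docs)
```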
can the same be done with hugging face?
Yeah, there are tons of models there for summarization: huggingface.co/models?pipeline_tag=summarization&sort=downloads
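A minimal Hugging Face sketch using the transformers summarization pipeline (the model name is just a common choice, not from the video; the first call downloads the weights):

```python
def hf_summarize(text):
    # Deferred import so the sketch loads without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    return summarizer(text, max_length=150, min_length=30)[0]["summary_text"]
```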
@@automatalearninglab ok Thanks.
Can someone solve the `disallowed_special=()` error?
I didn't get that issue, can you write the entire error?
@@automatalearninglab I get:
Encountered text corresponding to disallowed special token ''
This is raised when creating the index with VectorstoreIndexCreator
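The usual fix (an assumption based on how tiktoken validates input, not something from the video) is to pass disallowed_special=() to the embeddings, so special-token strings appearing in the PDF text are encoded as plain text instead of raising:

```python
def build_index_allowing_special_tokens(loader):
    # Deferred imports: legacy langchain API; requires an OpenAI key to run.
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.indexes import VectorstoreIndexCreator

    # disallowed_special=() tells tiktoken to treat special-token strings
    # (e.g. "<|endoftext|>" occurring in scraped text) as ordinary text.
    embedding = OpenAIEmbeddings(disallowed_special=())
    return VectorstoreIndexCreator(embedding=embedding).from_loaders([loader])
```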
2:56