Google Colab Notebook: colab.research.google.com/drive/1IVQkSGwS5kdTiKBwz85PO6vg_WaNx15c?usp=sharing
Event Slides: www.canva.com/design/DAF-L3KONQc/276F2Y-5Ym771I64RsjZiQ/view?DAF-L3KONQc&
🎯 Key Takeaways for quick navigation:
00:09 *🎵 Introduction and Overview of LlamaParse*
- Introduction of the hosts and the topic of the video, which is the new LlamaParse library.
- Brief discussion on the capabilities of LlamaParse, particularly its ability to parse embedded tables and figures.
02:09 *📚 Understanding LlamaParse and its Performance*
- Explanation of the purpose and functionality of LlamaParse.
- Discussion on how to build a query engine using LlamaParse for document retrieval applications.
05:07 *📈 Llama Index and its Role in Data Framework*
- Detailed explanation of Llama Index and its role as a data framework.
- Discussion on the concept of context augmentation and its importance in the data-centric paradigm.
10:54 *📊 LlamaParse's Parsing Algorithm and its Capabilities*
- Introduction to LlamaParse's proprietary parsing algorithm for documents with embedded objects.
- Discussion on the comparison of LlamaParse's performance with other parsing tools.
14:05 *🧪 Testing LlamaParse's Performance*
- Explanation of the testing process and the documents used for testing.
- Discussion on the results of the testing, highlighting the strengths and weaknesses of LlamaParse.
20:54 *💻 Demonstration of LlamaParse in Code*
- Walkthrough of the code used for testing LlamaParse.
- Explanation of the models and tools used in the testing process.
23:24 *📚 Setting up LlamaParse and Llama Index*
- Explanation of how to set up LlamaParse and Llama Index.
- Discussion on the process of generating an API key for Llama Cloud.
- Mention of the limitations of LlamaParse, such as only accepting PDFs and returning only plain text or markdown.
26:40 *🛠️ Initializing LlamaParse and Parsing Documents*
- Walkthrough of initializing LlamaParse and parsing documents.
- Explanation of the importance of preserving the structure of the data in the documents.
- Discussion on the inconsistency in the parsing process and the potential issues that may arise.
31:52 *🚀 Building a Query Engine with Llama Index v0.10*
- Introduction to Llama Index v0.10 and the changes it brings.
- Explanation of how to build a query engine using Llama Index.
- Discussion on the importance of preserving the structure of the data in the documents.
35:06 *🧪 Testing the Query Engine*
- Walkthrough of testing the query engine.
- Discussion on the results of the testing, highlighting the strengths and weaknesses of the query engine.
- Explanation of the importance of the ranker in the retrieval process.
39:21 *📊 Querying Structured Data*
- Demonstration of querying structured data using the query engine.
- Discussion on the accuracy of the results and the potential issues that may arise.
- Explanation of the importance of preserving the structure of the data in the documents.
42:52 *🎯 Testing LlamaParse on Figures and Graphs*
- Demonstration of LlamaParse's performance on figures and graphs.
- Discussion on the limitations of LlamaParse in understanding pictorial representations of data.
- Mention of the potential improvements in LlamaParse's ability to handle images in the future.
44:15 *📊 LlamaParse's Strengths and Limitations*
- Summary of LlamaParse's strengths, particularly in tabular extraction from PDFs.
- Discussion on the proprietary nature of LlamaParse and its ease of use.
- Mention of the potential improvements and developments in LlamaParse.
45:52 *💡 Q&A Session*
- Start of the Q&A session, addressing various questions about LlamaParse.
- Discussion on the potential of integrating LlamaParse with other tools and models.
- Explanation of the decision to use a recursive query engine and the benefits of this approach.
49:54 *🔄 Comparing LlamaParse with Other Tools*
- Comparison of LlamaParse with other open-source parsers.
- Discussion on the benefits of LlamaParse being integrated into the Llama Index ecosystem.
- Mention of the potential improvements and developments in LlamaParse.
51:17 *📑 Handling Tables in LlamaParse*
- Explanation of how LlamaParse handles tables and maintains their structure.
- Discussion on the limitations of LlamaParse in preserving the visual presentation of tables.
- Mention of the potential improvements and developments in LlamaParse.
53:07 *🔄 Integrating LlamaParse with Other Tools*
- Discussion on the potential of integrating LlamaParse with other tools and models.
- Explanation of the benefits of LlamaParse's output being in markdown format.
- Mention of the potential improvements and developments in LlamaParse.
54:58 *🚀 Future of LlamaParse and RAG*
- Discussion on the future of LlamaParse and RAG in the context of large context window models.
- Explanation of the benefits of RAG and its continued relevance.
- Mention of the potential improvements and developments in LlamaParse and RAG.
Made with HARPA AI
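Since the summary above notes that LlamaParse returns markdown and is strongest at tabular extraction, here is a minimal sketch of one way to turn a markdown table back into structured rows. It is not part of LlamaParse or Llama Index; the function name `parse_markdown_table` and the sample table are made up for illustration, and it assumes a simple pipe-delimited table with a header row.

```python
def parse_markdown_table(markdown):
    """Convert a pipe-delimited markdown table into a list of row dicts."""
    rows = []
    for line in markdown.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row, e.g. | --- | :---: |
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    # Pair each body row with the header names
    return [dict(zip(header, row)) for row in body]


# Invented sample data, only to show the shape of the input
sample = """
| Year | Revenue |
| --- | --- |
| 2022 | 10 |
| 2023 | 12 |
"""
table_rows = parse_markdown_table(sample)
```

Keeping rows as dicts keyed by the header is one way to keep the column structure attached to each value, which is the property the video stresses for querying tables.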
35:54 my favorite part:
"If we only miss sometimes, that's obviously much better than if we miss all the time or if we miss A LOT"
We also loved that part :)
The links for notebooks and slides are very helpful as it is sometimes necessary to access this content after the initial streaming due to schedule concerns. Thank you!
Thanks!
Thank YOU!
Solid tutorial. Just tried converting PDFs to MD files and a few other things. Mind-blowing potential. Thanks so much for sharing.
Thank you @Tigzig! We love to hear that you're already putting the tool to use!!
Thank you so much. I really needed this tutorial
Hi, great video, thanks!
Is this parser better than the one you used in the previous video?
The PDF-to-HTML one?
Or how does it compare to Surya?
Yes, this is better (across the board) than the previous approach we tried.
Did you also compare it to Surya, please?
@AI-Makerspace How does it compare to LangChain RAG implementations? Which one would you recommend for someone working with only HTML and PDF? Thank you very much :)
Both can be great tools! Use what you're most comfortable with!
Thank you for sharing a great tool once again. Just letting you know that the LlamaCloud link in the Colab notebook is misspelled: the letter I is missing at the end of the URL.
Thank you!
Parsing files: 50%|█████ | 1/2 [00:00
You'll need to make sure you provide a valid API key.
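A small pre-flight check can catch a missing key before any files are submitted for parsing. The sketch below is only an illustration: the helper name `resolve_llama_cloud_key` is hypothetical, and it assumes the key is stored in the `LLAMA_CLOUD_API_KEY` environment variable and starts with `llx-`.

```python
import os


def resolve_llama_cloud_key(env=None):
    """Return the LlamaCloud API key, or raise a readable error.

    Assumptions: the key lives in LLAMA_CLOUD_API_KEY and begins with "llx-".
    """
    env = os.environ if env is None else env
    key = env.get("LLAMA_CLOUD_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "LLAMA_CLOUD_API_KEY is not set; generate a key in LlamaCloud first."
        )
    if not key.startswith("llx-"):
        raise RuntimeError(
            "LLAMA_CLOUD_API_KEY does not look like a LlamaCloud key "
            "(expected an 'llx-' prefix)."
        )
    return key
```

Calling `resolve_llama_cloud_key()` at the top of a notebook and passing the result to the parser means a bad key fails right away with a clear message.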
Here, we are querying each PDF separately, right?
Can we query all the PDFs we have at once? Let's say I have ten 10-K PDFs. I want to parse them, store them in a vector store (ChromaDB, for example), and then do retrieval across all the documents at once.
Yes! If you store them in the same index - you will query across all of them at once.
@AI-Makerspace Yeah, I am storing all of them in one index, and my similarity score is very low, something like 0.08.
Could you tell me the best index algorithm to apply for my case?
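To make the single-index idea from the reply above concrete, here is a toy sketch in plain Python. It is not Llama Index or ChromaDB code: the `SharedIndex` class, the bag-of-words `embed` function, and the file names and chunk text are all invented for illustration, and a real setup would use an embedding model and a vector store instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SharedIndex:
    """One index holding chunks from many documents, each tagged with its source."""

    def __init__(self):
        self.entries = []  # list of (source, chunk, vector)

    def add_document(self, source, chunks):
        for chunk in chunks:
            self.entries.append((source, chunk, embed(chunk)))

    def query(self, question, top_k=3):
        q = embed(question)
        scored = [(cosine(q, vec), source, chunk) for source, chunk, vec in self.entries]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


# Invented example data: two filings stored in the same index
index = SharedIndex()
index.add_document(
    "acme_10k.pdf",
    ["Acme total revenue was 12 million dollars", "Acme risk factors include supply delays"],
)
index.add_document(
    "globex_10k.pdf",
    ["Globex total revenue was 30 million dollars", "Globex employs 400 people"],
)
hits = index.query("total revenue", top_k=2)
for score, source, chunk in hits:
    print(f"{score:.2f}  {source}  {chunk}")
```

Because every chunk carries its source, one query returns matches from both filings, and the scores can be inspected per document when similarity looks unexpectedly low.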
How do we get set up with an API key? From what I can tell, it looks like LlamaCloud is limited access.
You'll need an API key for this at this time.
@AI-Makerspace Are there any open-source alternatives?
Use ready-built RAG systems that clean, chunk, and meta-tag your PDFs. I did 600 in a day, and you get good data.
❤
wwHY aRE you TALKING iN CAPS lOCK!?!
We continue to work to get our audio game DIALED 🤙