LlamaIndex Webinar: Improving RAG with Advanced Parsing + Metadata Extraction

LlamaIndex

Views: 3,376


Published on Sep 5, 2024
In this video we cohost a workshop with the cofounders of Deasie (Reece, Leonard, Mikko) on improving RAG with advanced parsing and metadata.
The data processing layer is one of the most important pieces to get right for RAG. AI engineers need to make careful decisions about parsing and transformations, including metadata extraction and chunking, to make sure that their end-to-end QA system surfaces relevant results.
This is a nice two-part workshop that demonstrates the following:
- the value of good parsing itself over complex documents, with LlamaParse
- the additional value of adding metadata through Deasie's powerful automated labeling platform
We show overall experimental results on research papers, validating that the combination of parsing + metadata gives good performance.
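To make the idea of metadata-aware retrieval concrete, here is a minimal sketch in plain Python. It is not the LlamaParse or Deasie API: the `Chunk` class, the `retrieve` function, the `section`/`topic` tags, and the word-overlap scoring are all hypothetical stand-ins chosen for illustration. It only shows how a metadata filter applied before ranking can change which chunk is surfaced.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    # A parsed piece of a document plus its metadata tags (hypothetical schema).
    text: str
    metadata: dict = field(default_factory=dict)


def tokenize(text: str) -> set:
    # Lowercase, split on whitespace, and strip basic punctuation.
    return {w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")}


def retrieve(chunks, query, filters=None, top_k=2):
    """Keep chunks whose metadata matches every filter, then rank by word overlap."""
    filters = filters or {}
    candidates = [
        c for c in chunks
        if all(c.metadata.get(k) == v for k, v in filters.items())
    ]
    q = tokenize(query)
    # Word overlap is a toy stand-in for embedding similarity.
    ranked = sorted(candidates, key=lambda c: len(q & tokenize(c.text)), reverse=True)
    return ranked[:top_k]


chunks = [
    Chunk("The model is trained with a contrastive loss.",
          {"section": "methods", "topic": "training"}),
    Chunk("We report accuracy on three benchmarks.",
          {"section": "results", "topic": "evaluation"}),
    Chunk("Training loss curves are shown for all runs.",
          {"section": "results", "topic": "training"}),
]

query = "What loss is used for training?"
unfiltered = retrieve(chunks, query, top_k=1)
filtered = retrieve(chunks, query, filters={"section": "methods"}, top_k=1)

print("unfiltered:", unfiltered[0].text)
print("filtered:  ", filtered[0].text)
```

In this toy data, the unfiltered query ranks the "results" chunk first because it shares more words with the question, while filtering on `section == "methods"` surfaces the chunk that actually describes the loss. A real system would replace the overlap score with vector similarity and take the tags from an extraction step.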

Comments • 13

@SamiSabirIdrissi months ago ⁺²
Overall I think this is super dope! I can’t wait to try this. The "increase routing accuracy" capability is wild. Pulling relevant data with high accuracy is extremely important! 💪⚡️
@SamiSabirIdrissi months ago ⁺¹
Very, very interesting. I feel like this is similar to Azure’s Document Intelligence feature?
@awakenwithoutcoffee months ago
Yup, similar to Unstructured / LlamaParse + LlamaExtract (new). There are also new OCR models like ColPali!
months ago ⁺²
Same for the metadata tag generation: another OpenAI GPT wrapper doing generation or mapping, depending on whether the tags are suggested or not. As shown, the best result is obtained with the custom metadata. It means that humans are still needed to do the most difficult and time-consuming task, i.e. defining the custom tasks...😢
@awakenwithoutcoffee months ago
Nah, I can't see this not being automated in the foreseeable future. There are already OCR models with long context memory that are able to create metadata tags. Give it a few months and this metadata "problem" will be solved.
@pin65371 months ago ⁺²
It seems to me like this would get much more effective with a graph system? When Jerry was asking about how the data would be retrieved, it seemed like graph would work well with this. When you ask a question, the LLM would first retrieve relevant parent metadata and then branch out from there. The advantage would be that connections that maybe aren't so obvious with vector would be very obvious with graph. Also, with graph at least you have visibility to be able to manually go in and see what is going on. I liked that last question as well. It seems like maybe it wasn't something they thought about, but they might look at how to implement something like that. Tokens are getting so cheap now that it would make sense. Especially if, let's say, you are using the OpenAI 4o-mini model, it's 30 cents for a million output tokens. Just getting it to output some extra metadata would essentially be free and would only make the whole system more efficient in the long run.
@awakenwithoutcoffee months ago
I agree, but it is still too expensive and difficult to fully automate correctly. Let's keep in touch through the comments, as we engineers are looking for production-ready techniques. My take is that GraphRAG is not ready yet, but it might be early next year (for enterprise).
months ago ⁺⁶
The example with PyPDF is not correct, as nobody uses PyPDF text extracted per page. Instead, there is post-processing on the raw text. All these startup founders think that we are dummies and propose in their "products" the recipes that we have all been using for months or years without pretending to build a company on top of them. Same for metadata... Almost nothing new here. 😢😮
@awakenwithoutcoffee months ago
You bring up an important point: the part about cross-page context confused me, since Jerry basically didn't know why this was happening. Have you found additional information or techniques yourself? I'm looking for production-ready techniques for metadata extraction. One alternative new approach is ColPali.
@MatijaGrcic months ago
This was great, thanks for sharing.
@isle1009 months ago
8:16 Does Deasie support languages other than English well, especially Korean?
@Deasie months ago ⁺²
Yes, we do support other languages, including Korean!
