Evaluating RAG Performance with Vector Databases | BLEU, ROUGE, and RAGAS

Retrieval-Augmented Generation (RAG) Patterns and Best Practices

Semantic Chunking for RAG with #langchain

เซอร์ไพรส์ซื้อรถคันใหม่ให้พี่หน่อง! ไม่ต้องทนขับรถเก่าอีกต่อไป ขับมาอวดจนพ่อหมั่นไส้

ใครขยับไม่ได้เป็น!!

เจ้าของแทบทรุด บ้านสร้างได้ 3 เดือน พังทรุดตัว เพจดังชี้สาเหตุ ไม่ใช่เกิดจากเสาเข็ม

Chunking Best Practices for RAG Applications

KX

มุมมอง 11 362

เพิ่มลงใน
- เพลย์ลิสต์ของฉัน
- ดูภายหลัง
แชร์

แชร์

ฝัง

ขนาดวิดีโอ:

แสดงแผงควบคุมโปรแกรมเล่น

เล่นอัตโนมัติ

เล่นใหม่

เผยแพร่เมื่อ 1 ม.ค. 2025

ความคิดเห็น • 13

@reiniervaneijk 9 หลายเดือนก่อน ⁺¹
Good job guys, valuable talk thnx
@mauriciolopes8502 11 หลายเดือนก่อน ⁺²
Thank you, Ryan! Awesome lecture.
@deepaksingh9318 9 หลายเดือนก่อน ⁺¹
Thanks .. It was A very good content and full of Details..
@tizulis2 9 หลายเดือนก่อน ⁺¹
excellent presentation!
@IgnacioLlorca-v9r ปีที่แล้ว ⁺²
Keep up the good work!
@Jaybearno 11 หลายเดือนก่อน ⁺⁴
Hi, thanks for the video, really covered a lot of relevant questions for me. Open question to the community-
I have been struggling with the retrieval relevance for relatively small chunks using ada-002 (OpenAI embedding). For example, I search do a similarity Search on a key word ("sea slug") I know only appears a few times, and the top k result doesn't even include either parts words. It appears in the text as "sea-slug", but this feels extremely brittle and like something the embeddings should capture. Is this somewhat expected? Hence the need for more complicated retrieval?
@RyanSieglerAI 11 หลายเดือนก่อน
Since the embeddings capture the context of a chunk, it isn't focused on specific words (this is where hybrid search can come into play). My thought is the embedding model doesn't know much context around a word like "sea-slug" so potentially finetuning the embedding model with some examples using that phrase, or using a hybrid search method would help.
@tonylv6119 9 หลายเดือนก่อน ⁺¹
Sometimes, document has some images and figures inside, i think that's hard part to deal with that for RAG.😊
@maryamashraf6370 10 หลายเดือนก่อน
Great video, learnt allot! Had a question. What should be the chunking approach for a RAG application scraping the Internet for context? Since the documents would be web pages I get that you'd start off with the html splitter, but what approach should you use to try to get as much relevant context as possible while limiting the number of pages you embed? Especially considering that embeddings will be made in real time, trying to make the process as fast as possible etc. Would the approach be very different from using an offline document corpus?
@soren81 10 หลายเดือนก่อน
Great video! I have a question about chunk decoupling. Shouldn't the vector storage embedding do pretty much the same abstraktion with the large text, as the summary does? I mean, wouldn't the summary and the original end up i the same place in the vector space, rendering the summary more or less pointless?
@RyanSieglerAI 10 หลายเดือนก่อน ⁺²
Thanks for the question! In this context, the summary should highlight the key points and concepts in the original document, which should make retrieval more accurate especially in cases where there are documents covering similar/adjacent concepts. This is because in a full document there could be unnecessary information that could throw off vector search. The quality of the summary needs to be high for this to work. If the quality of the summary is not good and does not present the key points of the original document then yes it would be better to just embed original document as a whole.
@vijaybrock 8 หลายเดือนก่อน
Hi Sir,
Can you suggest me the best chunking strategy for 10K reports (pdfs) to chat with?

ต่อไป

เล่นอัตโนมัติ

Evaluating RAG Performance with Vector Databases | BLEU, ROUGE, and RAGAS

Evaluating RAG Performance with Vector Databases | BLEU, ROUGE, and RAGAS

Retrieval-Augmented Generation (RAG) Patterns and Best Practices

Retrieval-Augmented Generation (RAG) Patterns and Best Practices

Semantic Chunking for RAG with #langchain

Semantic Chunking for RAG with #langchain

เซอร์ไพรส์ซื้อรถคันใหม่ให้พี่หน่อง! ไม่ต้องทนขับรถเก่าอีกต่อไป ขับมาอวดจนพ่อหมั่นไส้

เซอร์ไพรส์ซื้อรถคันใหม่ให้พี่หน่อง! ไม่ต้องทนขับรถเก่าอีกต่อไป ขับมาอวดจนพ่อหมั่นไส้

ใครขยับไม่ได้เป็น!!

ใครขยับไม่ได้เป็น!!

เจ้าของแทบทรุด บ้านสร้างได้ 3 เดือน พังทรุดตัว เพจดังชี้สาเหตุ ไม่ใช่เกิดจากเสาเข็ม

เจ้าของแทบทรุด บ้านสร้างได้ 3 เดือน พังทรุดตัว เพจดังชี้สาเหตุ ไม่ใช่เกิดจากเสาเข็ม

Cool Items!🥰 New Gadgets, Smart Appliances, Kitchen Tools Utensils, Home Cleaning, Beauty #shorts

Cool Items!🥰 New Gadgets, Smart Appliances, Kitchen Tools Utensils, Home Cleaning, Beauty #shorts

Building Production RAG Over Complex Documents

Building Production RAG Over Complex Documents

Certification Insights for DevOps and AWS

Certification Insights for DevOps and AWS

RDA TIGER Cross-Fertilisation Webinar #3: FAIR/TRUST/CARE Adoption

RDA TIGER Cross-Fertilisation Webinar #3: FAIR/TRUST/CARE Adoption

Stanford CS25: V3 I Retrieval Augmented Language Models

Stanford CS25: V3 I Retrieval Augmented Language Models

🔵 Improve RAG performance with Knowledge Graphs, Generative AI Hub and SAP BTP

🔵 Improve RAG performance with Knowledge Graphs, Generative AI Hub and SAP BTP

Navigating IP and Technology Transfer for your Army ATTR Partnership

Navigating IP and Technology Transfer for your Army ATTR Partnership

Practical RAG - Choosing the Right Embedding Model, Chunking Strategy, and More

Practical RAG - Choosing the Right Embedding Model, Chunking Strategy, and More

Retrieval Augmented Generation (RAG): Boosting LLM Performance with External Knowledge

Retrieval Augmented Generation (RAG): Boosting LLM Performance with External Knowledge

Advanced RAG Techniques with @LlamaIndex

Advanced RAG Techniques with @LlamaIndex

#JasonDeruloTV // Funny #GotPermissionToPost From @SofiManassyan #SlowLow

#JasonDeruloTV // Funny #GotPermissionToPost From @SofiManassyan #SlowLow

ถ้าม้าโดนแกล้งที่โรงเรียน ม้าจะฟ้องครูว่าอะไร #แต้มเซน #การ์ตูน #tamzen #ตลก #shortvideo #การ์ตูน

ถ้าม้าโดนแกล้งที่โรงเรียน ม้าจะฟ้องครูว่าอะไร #แต้มเซน #การ์ตูน #tamzen #ตลก #shortvideo #การ์ตูน

🎄✨ Puff is saving Christmas again with his incredible baking skills! #PuffTheBaker #thatlittlepuff

🎄✨ Puff is saving Christmas again with his incredible baking skills! #PuffTheBaker #thatlittlepuff

#เดอะตุ๊ก !! เจาะเดือด ทีมชาติ ผ่าฟอร์ม !! ทีมชาติไทย มันส์ เปิด สาเหตุ !! ระบบ+แท็คติก

#เดอะตุ๊ก !! เจาะเดือด ทีมชาติ ผ่าฟอร์ม !! ทีมชาติไทย มันส์ เปิด สาเหตุ !! ระบบ+แท็คติก

HIGHLIGHTS : Singapore 2-4 Thailand | ASEAN Championship 2024 | 17.12.24

HIGHLIGHTS : Singapore 2-4 Thailand | ASEAN Championship 2024 | 17.12.24

ทำผิดกฏหมาย 100 ข้อ ในวันเดียว!!

ทำผิดกฏหมาย 100 ข้อ ในวันเดียว!!

หัวหน้าแก๊งพาลูกสาวไปกินไก่ทอด เจอกลุ่มนักเลงหาเรื่อง เลยจัดการพวกนั้นจนพ่ายแพ้

หัวหน้าแก๊งพาลูกสาวไปกินไก่ทอด เจอกลุ่มนักเลงหาเรื่อง เลยจัดการพวกนั้นจนพ่ายแพ้