GraphRAG or Speculative RAG?

  • Published 21 Aug 2024
  • What are the latest and best RAG systems as of mid-July 2024? If you build your next RAG system to integrate external data (from databases), which RAG system should you choose for the best performance: GraphRAG or Speculative RAG? My AI research channel is here to provide answers.
    This video introduces a novel framework called Speculative Retrieval-Augmented Generation (Speculative RAG), designed to optimize retrieval-augmented generation systems by efficiently generating accurate responses. This method innovatively separates the retrieval-augmented generation process into two distinct phases: drafting and verification. In the drafting phase, a specialized, smaller language model (LM) generates multiple answer drafts in parallel, each from a distinct subset of retrieved documents. This approach ensures diversity in perspectives and reduces redundancy. In the verification phase, a larger, generalist LM evaluates these drafts and selects the most accurate response based on a scoring system that assesses the drafts against their rationales.
    The Speculative RAG model significantly improves the efficiency and accuracy of response generation in knowledge-intensive tasks by leveraging parallel processing and optimized document sampling. The framework clusters documents based on content similarity before drafting to minimize information overload and enhance focus. The model has been tested across various benchmarks such as TriviaQA, MuSiQue, PubHealth, and ARC-Challenge, demonstrating substantial improvements in both speed and accuracy compared to conventional RAG systems.
    GraphRAG is a recent development from Microsoft that significantly enhances the performance of large language models (LLMs) through the integration of knowledge graphs with Retrieval Augmented Generation (RAG). It was designed to address the shortcomings of traditional RAG, which typically relies on vector similarity for information retrieval, often resulting in inaccuracies when dealing with complex or comprehensive queries.
    All rights with the authors:
    "Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting"
    arxiv.org/pdf/...
    #airesearch
    #aieducation
    #newtechnology
  • Science & Technology
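The drafting/verification split described in the summary above can be sketched in a few lines. Here `draft_lm` and `verify_lm` are hypothetical stand-ins for the specialist drafter and generalist verifier, not the paper's actual models:

```python
from typing import Callable, List, Tuple

def speculative_rag(
    question: str,
    doc_clusters: List[List[str]],               # documents pre-clustered by content similarity
    draft_lm: Callable[[str], Tuple[str, str]],  # small LM: prompt -> (answer draft, rationale)
    verify_lm: Callable[[str], float],           # large LM: prompt -> score for a draft
) -> str:
    """Draft one answer per document subset, then keep the best-scoring draft."""
    drafts = []
    for subset in doc_clusters:  # drafting phase (run in parallel in the paper)
        prompt = "Context:\n" + "\n".join(subset) + f"\nQuestion: {question}"
        drafts.append(draft_lm(prompt))  # (answer draft, rationale)
    # verification phase: the generalist LM scores each draft against its rationale
    scored = [
        (verify_lm(f"Q: {question}\nRationale: {r}\nAnswer: {a}"), a)
        for a, r in drafts
    ]
    return max(scored)[1]
```

With stub lambdas in place of real models, the control flow is: one draft per cluster, one verification score per draft, highest score wins.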

Comments • 26

  • @MattJonesYT
    @MattJonesYT several months ago +11

    I think the weakest link in RAG is that it usually chunks text without respect to context, which means the data is immediately corrupted, and then you need a really complex system to make the data uncorrupted again. I think the biggest gains to be had are at the start of the process, by chunking into intelligent semantic paragraphs that stand on their own, like a section of paragraphs in a book. Just splitting every n tokens ruins RAG performance.
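A toy illustration of the point above, using word counts as a stand-in for tokens (an assumption): fixed-size splitting versus greedily packing whole paragraphs so each chunk stands on its own:

```python
def naive_chunks(text: str, n: int) -> list[str]:
    """Split every n words, ignoring structure -- the approach the comment criticizes."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]

def semantic_chunks(text: str, max_words: int) -> list[str]:
    """Greedily pack whole paragraphs into chunks, never splitting mid-paragraph."""
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        words = len(para.split())
        if current and count + words > max_words:  # would overflow: close the chunk
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

The naive splitter happily cuts a sentence in half at a chunk boundary; the paragraph packer only ever breaks at boundaries the author already put there.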

    • @themax2go
      @themax2go several months ago +1

      Well, that's why contextual graphs now exist, i.e. MS's GraphRAG and now SciPhi's Triplex...

    • @criticalnodecapital
      @criticalnodecapital several months ago

      @@themax2go 100%. I was waiting 4 months for them to drop it, and then realised that the speculative approach, or using an abstraction layer to let the LLMs bash it out, was a better way to go! Evals, FML... why did I not do this before?

  • @davidwynter6856
    @davidwynter6856 several months ago +6

    Through actual use of baseline RAG over a year ago, I realised knowledge graphs, with their rich semantic capability, would improve things radically. But after some experimentation I realised I needed to combine the triples and the embeddings, for simplicity and performance reasons. This is easy and free using Weaviate, which allows a schema to be added on top of the vector store. Since then I have built 4 different knowledge graphs over Milvus and Weaviate; they work brilliantly, and you can also build embeddings for the full triple as well as the constituent subject, predicate and object. GPT-4o understands triple representations extracted from the user prompt very well.
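A minimal sketch of the indexing idea this comment describes: embed the full triple and each constituent separately. A plain dict stands in for Weaviate/Milvus, and a hash-based stub stands in for a real embedding model (both are assumptions, not the commenter's private toolkit):

```python
import hashlib
import math

def embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in embedding (hash-based, unit-normalized); a real system would call an embedding model."""
    raw = hashlib.sha256(text.encode()).digest()[:dim]
    v = [b / 255.0 for b in raw]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def index_triple(store: dict, subj: str, pred: str, obj: str) -> None:
    """Store embeddings for the full triple AND for subject, predicate, object individually."""
    triple = f"{subj} {pred} {obj}"
    store[triple] = {
        "triple": embed(triple),
        "subject": embed(subj),
        "predicate": embed(pred),
        "object": embed(obj),
    }
```

Querying can then match against whichever granularity fits: the whole fact, or just an entity or relation extracted from the user prompt.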

    • @fintech1378
      @fintech1378 several months ago +5

      Awesome, any video / link to share?

    • @davidwynter6856
      @davidwynter6856 several months ago

      @@fintech1378 Sorry, I'm currently trying to get a job after my sabbatical; the toolkit I built has to remain private.

    • @artur50
      @artur50 several months ago

      github ;)?

    • @Karl-Asger
      @Karl-Asger several months ago +1

      Great to hear this. Can you speak to the cost of generating the knowledge graph, and what scale you're working with? I really like your insight here about embedding not just the chunks but also the triplets.

  • @antaishizuku
    @antaishizuku several months ago +4

    Preprocessing text really gives better results, so at the end of this, if you return a preprocessed string to the LLM instead of the original, it would probably do better. Personally I'm focusing on a different approach, but from my testing I found this helps.

  • @whoareyouqqq
    @whoareyouqqq 19 days ago

    Google reinvented the map-reduce algorithm, where the map step is drafting and the reduce step is verification.
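The analogy above, written literally with Python's `map` and `functools.reduce`; `draft` and `score` are stub callables with hypothetical names:

```python
from functools import reduce

def draft_then_verify(question, doc_subsets, draft, score):
    """Drafting = map over document subsets; verification = reduce to a single answer."""
    candidates = map(lambda docs: draft(question, docs), doc_subsets)  # "map" step
    # "reduce" step: keep whichever candidate the verifier scores higher
    return reduce(lambda a, b: a if score(a) >= score(b) else b, candidates)
```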

  • @c.d.osajotiamaraca3382
    @c.d.osajotiamaraca3382 several months ago

    Thank you for helping me avoid the rabbit hole.

  • @iham1313
    @iham1313 several months ago

    There was (some time ago, and somewhere) the argument that models are not capable of understanding the question, as they are not trained on the domain-specific data.
    It would be interesting to combine training a base model on domain data (like articles, documents and books) with sending it off to a RAG-like setup to retrieve referable results.

  • @user-gj1gd5pi1m
    @user-gj1gd5pi1m several months ago +7

    Both are not practical. I guess the authors do not have production-level experience in RAG.

  • @topmaxdata
    @topmaxdata several months ago

    In many cases, working with a KV store or a relational database with extracted entities and relationships is more practical than using a graph database like Neo4j for the following reasons:
    Familiarity: Most developers are already familiar with relational databases and key-value stores, making them easier to work with and maintain.
    Ecosystem: Relational databases and KV stores have mature ecosystems with robust tools, libraries, and integrations.
    Performance: For many use cases, KV stores and well-designed relational databases can offer excellent performance.
    Flexibility: Relational databases can handle a wide range of data structures and query patterns.
    Scalability: Both KV stores and relational databases can be scaled horizontally or vertically to meet performance needs.
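A minimal sketch of that relational alternative using Python's built-in sqlite3: extracted entities and relationships as two tables, with one-hop graph traversal expressed as a plain join (table and column names are illustrative):

```python
import sqlite3

# Entities and relationships in a relational layout instead of a graph database.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE relation (
        subject_id INTEGER REFERENCES entity(id),
        predicate  TEXT,
        object_id  INTEGER REFERENCES entity(id)
    );
""")

def add_fact(subj: str, pred: str, obj: str) -> None:
    """Upsert both entities, then link them with a predicate."""
    for name in (subj, obj):
        con.execute("INSERT OR IGNORE INTO entity(name) VALUES (?)", (name,))
    con.execute(
        """INSERT INTO relation
           SELECT s.id, ?, o.id FROM entity s, entity o
           WHERE s.name = ? AND o.name = ?""",
        (pred, subj, obj),
    )

def neighbors(name: str) -> list[tuple[str, str]]:
    """One-hop traversal via an ordinary SQL join."""
    rows = con.execute(
        """SELECT r.predicate, o.name
           FROM relation r
           JOIN entity s ON s.id = r.subject_id
           JOIN entity o ON o.id = r.object_id
           WHERE s.name = ?""",
        (name,),
    )
    return rows.fetchall()
```

Multi-hop traversal gets clumsier in SQL than in a graph query language, which is the usual trade-off against the familiarity and ecosystem points above.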

    • @code4AI
      @code4AI several months ago +1

      Smile. And after the praise for a KV, now list 5 problems with a KV, just to have a balanced presentation from your side.

    • @topmaxdata
      @topmaxdata several months ago

      @@code4AI Curious, what are the 5 problems? Thank you.

    • @be1tube
      @be1tube several months ago

      I'll give the disadvantages a shot:
      Consistency: joins are not atomic, so by the time you finish the join the info may be outdated
      Extra memory: joins must be done in the client
      Extra queries: need to do one query per joined table usually
      No relationship constraints across tables or rows
      Imperative style: you tell the DB every step. You don't get intelligent query optimizers giving you the benefit of years of database research, you have to build it from scratch.
      The DB doesn't know its structure: it doesn't know when to cascade deletes or when to store two elements nearby because they will be accessed together.
      Note: it's been a decade since I tried to use a large KV store, so maybe some of these are better now.
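A toy illustration of the client-side join those points describe, with plain dicts standing in for two logical "tables" in a KV store (all names hypothetical):

```python
# Two logical "tables" keyed by ID, as a KV store would hold them.
users = {"u1": {"name": "Ada"}, "u2": {"name": "Lin"}}
orders = {"o1": {"user": "u1", "item": "book"}, "o2": {"user": "u2", "item": "pen"}}

def orders_with_names(order_ids):
    """Join orders to user names in the client: one extra lookup per row,
    all intermediate state held in client memory, no atomicity across lookups."""
    joined = []
    for oid in order_ids:
        order = orders[oid]            # lookup 1: the order row
        user = users[order["user"]]    # lookup 2: the joined side, per row
        joined.append((user["name"], order["item"]))
    return joined
```

Each row costs an extra round-trip, and nothing stops `users` from changing between the two lookups: the consistency and extra-query points above in miniature.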

  • @thomaslapras1669
    @thomaslapras1669 several months ago

    Great video, as usual! But I have one question: what if the relevant context is split across different sub-datasets?

    • @code4AI
      @code4AI several months ago

      You operate with multiple datasets.

  • @lionardo
    @lionardo several months ago +1

    I doubt this works better than simple RAG.

  • @sinasec
    @sinasec several months ago +1

    Is there any source code for this RAG?

  • @AaronALAI
    @AaronALAI several months ago

    Hmm 🤔 I don't doubt there are better RAG strategies... however, RAG with a model of good context size (65k+) yields very good results. But there will always be a scaling issue: too little model context or too large a DB.

    • @code4AI
      @code4AI several months ago +1

      Whenever a global corporation tells us that their old product has very poor performance and that we now have to buy a new product... we can decide on a product that fits our needs.

    • @GeertBaeke
      @GeertBaeke several months ago +1

      @@code4AI That is not exactly what Microsoft is saying. The team that built GraphRAG focused mainly on global queries that use community summaries created during indexing. This allows you to ask global questions about your data that, out of the box, get better answers than baseline RAG. And their local queries are actually a combination of vector queries to find entry points in the graph, followed by graph traversal. It's about combining things, not simply selecting one thing.

  • @bastabey2652
    @bastabey2652 several months ago +1

    A KG is a pain in the neck.