I am super stoked about this. Soooo many AI content channels just regurgitate Anthropic's article with no original content. Would love someone to actually build an example and show a comparison of how standard RAG worked for them vs. this new method.
So much to learn... every day! Thanks for providing great content! You are one of my daily learning resources. Keep kicking AI!
Thanks. Smile.
Love it, keep the humor in these videos, it's beautiful 😂
I like your sense of humour. I *giggled* when you spoke about chickens and eagles and categorised both as birds. Chickens have not yet formally been added to the bird category… maybe due to their limited flying capacity 😂
Some birds refused the AI toolbox, I guess.
What about building a RAG system with contextual retrieval and open-source models like Llama?
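Nothing in the recipe is tied to Claude, so an open-source stack should work in principle. A minimal sketch of the contextualisation and embedding steps using local models through Ollama; the model names and prompt wording here are illustrative assumptions, not from the video:

```python
# Sketch: contextual retrieval's "situate the chunk" step with a local
# open-source model via Ollama. Assumes `pip install ollama` and that the
# `llama3.1` and `nomic-embed-text` models have been pulled locally.
import ollama

def situate_chunk(document: str, chunk: str) -> str:
    """Ask the local LLM for a short context blurb that situates the chunk in the document."""
    prompt = (
        f"<document>\n{document}\n</document>\n\n"
        f"Here is a chunk from the document:\n<chunk>\n{chunk}\n</chunk>\n\n"
        "Give a short, succinct context that situates this chunk within the overall "
        "document, to improve search retrieval of the chunk. Answer with the context only."
    )
    response = ollama.chat(model="llama3.1", messages=[{"role": "user", "content": prompt}])
    return response["message"]["content"].strip()

def embed_contextualised(document: str, chunk: str) -> list[float]:
    """Prepend the generated context to the chunk before embedding it."""
    contextual_chunk = situate_chunk(document, chunk) + "\n\n" + chunk
    return ollama.embeddings(model="nomic-embed-text", prompt=contextual_chunk)["embedding"]
```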
So... we've been doing this for MONTHS: store the chunk without implicit references to avoid false in-context-learning generations, and rewrite it with recursive summarisation until the full document fits in the context window.
Didn't know that could justify a published paper... 😅 To us, it's just common sense and some implementation details.
Anyone agree or are we secretly geniuses? 😂
Great video BTW
Late Chunking has solved the Chunking Context problem.
Please indicate each of your jokes with a clear label, like joke::
12:33 Give the whole document?! But at the beginning, 0:40, they say that if the knowledge base is under 200K tokens it's better to send it within the prompt (so no RAG is used). 🤔
So what happens in that "situate_context" code if the document is big, >200K tokens?
Exactly, that part was murky. I suppose those would be larger chunks, not whole documents, e.g. the target chunk + a few surrounding chunks, for a larger context.
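One way to make that concrete: for documents that exceed the context window, situate each chunk against a window of neighbouring chunks (or a per-section summary) instead of the full document. A rough sketch of that workaround, where `situate` stands in for whatever LLM call generates the context blurb and the window size is an arbitrary assumption:

```python
# Sketch: contextualise each chunk of an over-long document using only its
# local neighbourhood, since the whole document will not fit in the prompt.
from typing import Callable

def windowed_neighbourhood(chunks: list[str], idx: int, window: int = 10) -> str:
    """Return the target chunk plus `window` chunks on each side, joined as a pseudo-document."""
    start = max(0, idx - window)
    end = min(len(chunks), idx + window + 1)
    return "\n\n".join(chunks[start:end])

def situate_long_document(chunks: list[str],
                          situate: Callable[[str, str], str],
                          window: int = 10) -> list[str]:
    """For each chunk, generate a context blurb from its neighbourhood and prepend it."""
    out = []
    for i, chunk in enumerate(chunks):
        blurb = situate(windowed_neighbourhood(chunks, i, window), chunk)  # LLM call
        out.append(blurb + "\n\n" + chunk)
    return out
```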
It is so easy to find the answer. Just upload a 500-page document to Claude and see what happens. You can experience AI yourself! Trust yourself.
How does this compare to Jina's late-chunking approach for contextual understanding?
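For reference, late chunking goes in the opposite direction: instead of asking an LLM to write context for each chunk, it embeds the whole document once with a long-context embedding model and then pools the token embeddings per chunk span, so every chunk vector already "sees" the rest of the document. A minimal sketch, assuming a long-context embedder available through Hugging Face transformers; the Jina model name and the character-offset chunk spans are illustrative assumptions:

```python
# Late-chunking sketch: one forward pass over the full document, then
# mean-pool the token embeddings that fall inside each chunk's character span.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "jinaai/jina-embeddings-v2-base-en"  # example model; needs trust_remote_code
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)

def late_chunk(document: str, chunk_spans: list[tuple[int, int]]) -> list[torch.Tensor]:
    """Return one pooled embedding per (start, end) character span of the document."""
    enc = tokenizer(document, return_tensors="pt",
                    return_offsets_mapping=True, truncation=True, max_length=8192)
    offsets = enc.pop("offset_mapping")[0]              # (num_tokens, 2) character offsets
    with torch.no_grad():
        token_embs = model(**enc).last_hidden_state[0]  # full-document token embeddings
    pooled = []
    for start, end in chunk_spans:
        # select real tokens inside the span (skip zero-length special tokens)
        mask = (offsets[:, 0] >= start) & (offsets[:, 1] <= end) & (offsets[:, 1] > offsets[:, 0])
        pooled.append(token_embs[mask].mean(dim=0))
    return pooled
```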
How about storing a hierarchical order of the chunks (e.g. by paragraph) as metadata alongside the embedding vectors? In addition, you can ask the LLM for the most important words in the prompt, such as names and other entities, search for those words in the chunk texts, and again use the hierarchical order of the chunks to pull in the neighbouring contextual chunks.
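That is essentially vector retrieval plus structural metadata plus entity keyword matching. A small sketch of the data layout that idea implies; the dataclass fields and helper names are my own, just to illustrate:

```python
# Sketch: store each chunk with hierarchical metadata (section path, paragraph
# index), then expand any retrieved hit to its neighbours and also pull chunks
# that contain entity keywords extracted from the user query.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    section: str          # e.g. "2. Methods > 2.1 Setup"
    paragraph_idx: int    # position within the section

def neighbours(chunks: list[Chunk], hit: Chunk, radius: int = 1) -> list[Chunk]:
    """Chunks from the same section within `radius` paragraphs of a retrieved hit."""
    return [c for c in chunks
            if c.section == hit.section
            and abs(c.paragraph_idx - hit.paragraph_idx) <= radius]

def keyword_hits(chunks: list[Chunk], entities: list[str]) -> list[Chunk]:
    """Chunks whose text mentions any entity the LLM extracted from the query."""
    lowered = [e.lower() for e in entities]
    return [c for c in chunks if any(e in c.text.lower() for e in lowered)]
```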
I have a question: if you need to load the entire document into the prompt, does that mean that Contextual RAG doesn't work for situations where the document has more than 200k tokens?
In a way, it seems that this solution undermines the main principle of RAG, which is to fragment the content so it 'fits' into the prompt.
Is this not also very useful if you use a very long system prompt?
I have a system prompt that is a couple of pages long, telling the AI what our coding conventions are. The idea is that instead of giving the model a bunch of code and having it try to guess the rules behind the code, we tell it. At least in my small amount of testing, this has worked quite well.
I would also like to see and/or work on an open-source implementation. If anyone has a resource, or @Discover AI would like to work on it, it would be appreciated. For the process described, why are the whole document and an individual chunk fed to the LLM each time? Couldn't you just feed the document once along with, for example, a batch of 100 chunks (assuming it fits in the context window)? Then the LLM could produce a batch of contextualised chunks, rather than calling it so many times.
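On the batching question: nothing technically prevents sending the document once together with many chunks and asking for one context per chunk; the open question is whether quality holds up versus per-chunk calls, and prompt caching already removes most of the cost of repeating the document. A hedged sketch of what a batched call could look like with the Anthropic SDK, where the prompt wording, batch size, and JSON output format are all assumptions:

```python
# Sketch: one call that contextualises a batch of chunks against the document.
# The batch must fit the context window AND the output token limit, and the
# JSON parsing assumes the model follows the format instruction.
import json
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

def situate_batch(document: str, chunks: list[str]) -> list[str]:
    numbered = "\n\n".join(f"<chunk id={i}>\n{c}\n</chunk>" for i, c in enumerate(chunks))
    prompt = (
        f"<document>\n{document}\n</document>\n\n"
        f"{numbered}\n\n"
        "For each chunk above, write a short context that situates it within the "
        "overall document, to improve search retrieval. Respond with a JSON list of "
        "strings, one per chunk, in order, and nothing else."
    )
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.content[0].text)

# Usage: contexts = situate_batch(doc_text, some_chunks)
```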
So. Did you see that Meta claims to have solved prompt injection?
Again?
That's why I can't turn away.
So in essence it's a load of BS. Why would we want to triple our embeddings requirement? It's not sustainable.
Embedding storage is the cheap part. Anthropic is stating a 47% accuracy improvement. If you use context caching, this will be fairly cheap to build out, and even cheaper if you use a local embeddings model.
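For concreteness, this is roughly what the caching trick looks like: the large document block is marked cacheable, so every per-chunk call after the first reuses the cached prefix instead of paying for the whole document again. A sketch assuming the Anthropic Python SDK's prompt caching; the model name and prompt wording are illustrative, and the exact cache behaviour should be checked against the current SDK docs:

```python
# Sketch: per-chunk contextualisation where the document prefix is cached.
import anthropic

client = anthropic.Anthropic()

def situate_with_cache(document: str, chunk: str) -> str:
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=200,
        messages=[{
            "role": "user",
            "content": [
                {   # large and identical across calls -> cached after the first call
                    "type": "text",
                    "text": f"<document>\n{document}\n</document>",
                    "cache_control": {"type": "ephemeral"},
                },
                {   # small and changes per call
                    "type": "text",
                    "text": (f"Here is the chunk we want to situate:\n<chunk>\n{chunk}\n</chunk>\n"
                             "Give a short, succinct context for this chunk to improve search "
                             "retrieval. Answer with the context only."),
                },
            ],
        }],
    )
    return response.content[0].text.strip()
```

Pairing that with a local embedding model for the contextualised chunks keeps the embedding side cheap as well.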
They are giving a master class on how to frontload the work and save on cost. If you have a large knowledge base, it's best to do this once; then you don't have to redo all the contextualisation again.
For example, all previous years' sales numbers for a company: a nice, static database.
Something like chatbot memory, on the other hand, would need the extra compute, since new information is constantly being ingested.
This could have been explained in 10 minutes instead of 34.
Glad to hear the concept of contextual retrieval clicked for you so quickly! If you're ready to explain it in 10 minutes now, I'd say the 34 minutes were well spent. Thanks for the feedback!