Check out the RAG Beyond Basics Course: prompt-s-site.thinkific.com/courses/rag
It'd be excellent if you could test GPT-4o and Flash against your RAG setup and show the results like you did in this video. That would be a nice demonstration of the different capabilities and results, of course with a local LLM in the mix.
Yes!
That would be great
In scientific papers, tables are usually in text format. LaTeX just uses fancy text formatting to make tables, so table-content extraction is not a test of a model's visual capabilities.
Thanks
Thank you 😊
One Q that I missed: when making API calls with our PDF, does our private data become publicly available in any way? Another amazing vid. Really appreciate all the work you put into making great content.
For the free API, Google does say they can use it for training. For the paid API, that doesn't seem to be the case. Now, just like with the other API providers, it really comes down to your own comfort level and how much you trust their word :)
What if Gemma 2 is also able to do this? How could we test that?
Hi, can you do a video on this:
In a typical AI workflow, you might pass the same input tokens over and over to a model. Using the Gemini API context caching feature, you can pass some content to the model once, cache the input tokens, and then refer to the cached tokens for subsequent requests. At certain volumes, using cached tokens is lower cost than passing in the same corpus of tokens repeatedly.
Context caching needs a 30k-token minimum, which is useless since most contexts generally come in under 30k. It might work for very long codebases or 10-15 long PDFs.
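For anyone who wants to try it anyway, here is roughly what context caching looks like with the google-generativeai Python SDK (a sketch; the API key, file path, model version, and TTL are placeholders):

```python
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")

# Upload the large document once via the File API.
doc = genai.upload_file(path="big_corpus.pdf")

# Cache the input tokens; the cached content must exceed the minimum size.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    display_name="big-corpus-cache",
    contents=[doc],
    ttl=datetime.timedelta(minutes=30),
)

# Later requests reference the cached tokens instead of resending them.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Summarize the main findings.")
print(response.text)
```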
Hi. Can you show us how to get to the UI?
Here are some considerations for using RAG with this LLM: large corpus and token cost. I believe one can bring the total token cost down by an order of magnitude. Say directly using PDFs takes 30k tokens on average; doing the same with RAG will cost around 2k on average. That was my heuristic for a single 15-20 page PDF.
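A quick back-of-the-envelope check of that claim (the token counts are the commenter's estimates; the price per token is a placeholder, so check current pricing):

```python
# Rough per-query cost comparison, using the commenter's estimates.
FULL_PDF_TOKENS = 30_000   # feeding the whole PDF every query
RAG_TOKENS = 2_000         # sending only the retrieved chunks
PRICE_PER_1M = 0.075       # placeholder $/1M input tokens

full_cost = FULL_PDF_TOKENS / 1_000_000 * PRICE_PER_1M
rag_cost = RAG_TOKENS / 1_000_000 * PRICE_PER_1M
ratio = FULL_PDF_TOKENS / RAG_TOKENS
print(f"full PDF: ${full_cost:.6f}/query, RAG: ${rag_cost:.6f}/query, {ratio:.0f}x cheaper")
```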
Thanks for your videos and course. You said at the beginning that Gemini 1.5 is only good for small docs; what would you recommend for a large corpus of multi-modal PDF requirements? Would an agentic approach work, breaking the PDFs up into buckets with a single agent to combine the responses?
You would need to use more traditional approaches: chunking, indexing, retrieval (parent document retrieval is a good approach). It is A LOT of work to take this to production (trust me, we know!), so you have to love it, ha.
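For reference, parent document retrieval looks roughly like this in LangChain (a sketch; the file path and embedding model are placeholders, and the chunk sizes are just common defaults):

```python
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFLoader("paper.pdf").load()  # placeholder path

# Small child chunks are embedded for precise search; the larger parent
# chunks they belong to are what actually get returned to the LLM.
retriever = ParentDocumentRetriever(
    vectorstore=Chroma(embedding_function=OpenAIEmbeddings()),
    docstore=InMemoryStore(),
    child_splitter=RecursiveCharacterTextSplitter(chunk_size=400),
    parent_splitter=RecursiveCharacterTextSplitter(chunk_size=2000),
)
retriever.add_documents(docs)

results = retriever.invoke("What does the paper say about evaluation?")
```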
Impressive model. Thank you for the video.
I think the main benefit of classic RAG so far, for me, has been citations and clear sourcing (where the LLM can return which page it is using for information). How well does Gemini Flash return this kind of info?
I haven't tested it on multiple files yet but I suspect that should be possible. I will put together a new tutorial on it when I get a chance.
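One way to probe this is to just ask for page-level citations in the prompt and spot-check them (a sketch; how reliably Flash grounds the page numbers is exactly the open question):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

doc = genai.upload_file(path="paper.pdf")  # placeholder path
response = model.generate_content([
    doc,
    "Answer the question and cite the page number for every claim as [p. N]. "
    "Question: Which datasets were used for evaluation?",
])
print(response.text)
```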
What about using Gemini Flash to parse the PDFs into markdown, optimally structure it for LLMs, and then embed it for RAG?
Pursuing this idea
@@wesleymogaka report back once you do it. Maybe send the YouTuber a link so he can also review it and give you some exposure.
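If anyone wants to try that pipeline, the first half might look something like this (a sketch; the prompt, model names, and naive chunking are assumptions, and you'd swap in your own vector store at the end):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

# Step 1: have Flash convert the PDF (text, tables, figures) to markdown.
doc = genai.upload_file(path="report.pdf")  # placeholder path
markdown = model.generate_content([
    doc,
    "Convert this PDF to clean markdown. Preserve tables as markdown "
    "tables and describe each figure in a blockquote.",
]).text

# Step 2: naively chunk the markdown and embed it for RAG.
chunks = [markdown[i:i + 2000] for i in range(0, len(markdown), 2000)]
embeddings = [
    genai.embed_content(model="models/text-embedding-004", content=c)["embedding"]
    for c in chunks
]
# Store (chunk, embedding) pairs in your vector DB of choice.
```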
Your Colab link doesn't work. It doesn't open.
Love the Meta paper choice to scan.
Thank you so much for this video!
I wanted to build a previous-year paper analysis system for my college (engineering). There are 7 departments in total, and all subjects come to 7*6*8. Can you advise: fine-tuning or RAG?
For this, my recommendation would be to use RAG.
Cool, thanks @@engineerprompt
Why test Gemini Flash? Doesn't Gemini Pro work better?
Pro is better but has more limitations for free usage.
Great video.
Thank you!
Great, I will test it :)
Let me know how it goes
A small number of PDFs means how many? What's your assumption?
As long as they fit in the context window, which is 1M tokens, although I would suggest using only about 50-70% of that. Using more can result in lost-in-the-middle issues.
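You can check the fit programmatically before sending anything (a sketch; the 70% budget is just the rule of thumb above, and the path is a placeholder):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

doc = genai.upload_file(path="docs.pdf")  # placeholder path
n_tokens = model.count_tokens([doc]).total_tokens

BUDGET = int(1_000_000 * 0.7)  # stay at ~70% of the 1M window
print(f"{n_tokens} tokens; fits within budget: {n_tokens <= BUDGET}")
```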
I don't like using libraries to parse my PDF files. I found it to be more complex and less robust than writing the parsing services myself. I will definitely give Flash a try though.
Agree, it's worth a shot.
Is there demand for RAG in the market?
RAG is the only real application of GenAI at the moment that businesses are actually widely using.
Please run an ad campaign for your channel, as it has the potential to get 500k subscribers in an hour.
This review is basically pointless. You're running it on one PDF. The whole PDF can easily be dumped into the context (the OpenAI default is 20 x 1,000-token chunks). You should be testing it on much larger datasets.
Gemini 1.5 Pro also has this new feature, I think.
Yes, it does. It's relatively more expensive, though, if you put it in production.
RAG in general has been slowly dying as context-window increases combine with cost decreases. On top of that, folks are getting better at compression, database use (LLMs understand SQL, etc.), and agentic flows.
The speed loss and cost of maintaining a vector database just isn't always worth it when I can simply task a flow with semantic search itself and feed the results to whatever needs them.
RAG is not dying. It merely depends on the use case. It was even mentioned several times in this video that this is not a replacement for RAG when there is a large corpus of information (millions of docs). It certainly is evolving, however, and quite rapidly. I would love to get to the point where I can avoid having to parse PDFs and documents completely, and instead just feed docs to a vision model and have the chunks stored directly in a DB. But getting rid of RAG completely? Nah. Not yet. I would say RAG would only go away if model training reaches the point where you can throw docs directly at the LLM itself rather than feeding them into a vector DB.
Why would you want to pay for a cloud GPT?! Do it yourself.
Check out localGPT for that :)
As usual, I will wait for third parties to verify which of Google's claims are real and which are just another scam.
No, it's slow.