This video is just poetry. Wonderful and clear explanation about RAG systems. The whole playlist is wonderful, I would even pay for such clear no-bullshit content! Thank you for the wonderful work, Matt! Appreciate it!
The "size doesn't matter" pixelating was a nice touch.
This is great, thank you 👍👍👍 ...
This channel is a hidden gem. Really appreciate the content!
Thank you for these. One of the best channels out there on how to use LLMs/Ollama for private data.
Wow. Amazing explanation.
Glad you liked it!
This is great I'm planning on adding RAG to my project in the next 2-3 weeks
This is awesome Matt! Thank you so much! Will be doing ALL OF THIS TONIGHT xD
Thanks for the course Matt.
Awesome explanation. Thank you
I love the fact that you included a history lesson about PDFs and the printers that required fonts. Also, I'd like you to know that I don't usually like YouTubers who talk instead of coding and showing it in the code editor, but somehow I love how you describe the lessons. Maybe it's your voice or your experience; whatever it is, it's unique to you. Thx Matt🌹
Most of my other videos spend more time doing, but then this would have been a much longer video. Those videos will come very soon, and then I can refer back to this one for the concepts.
Hi Matt Williams, thanks for the video. Just a suggestion: I think a flow diagram showing how RAG works at a high level would explain the concept better. I already know how RAG works, but while watching the video I put myself in the position of someone who had never heard of RAG before, and I think it would still be confusing.
BTW, re PDFs: "the" (proposed) method to "do it" is to use a vision model instead of OCR.
Great explanation. Thanks.
I recently used Excel's Get Data from a folder using Power Query, which did a great job extracting data from a hundred bank statements. Question, if I may: are Excel files okay, or is converting to CSV better? Cheers.
As always, great channel!
Loved the explanation. Though I'm not using it explicitly for building a RAG database, I've been using PyMuPDF to parse PDFs with various NLP libraries and LLMs, and I've been getting meh results. After your explanation, I'm considering whether it would make more sense to first convert the PDFs to text (I don't have access to the original text) and then use that... either way... thanks!
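The convert-then-use idea above usually means one more step before embedding: splitting the extracted text into overlapping chunks. As a minimal illustration (the function name and sizes here are made up for the sketch, not anything from the video), a word-based chunker might look like:

```python
# Hypothetical sketch: after a PDF has been converted to plain text,
# split it into overlapping word-based chunks before embedding each one.
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `chunk_size` words, overlapping by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap  # how far the window slides each iteration
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

if __name__ == "__main__":
    sample = " ".join(f"word{i}" for i in range(500))
    chunks = chunk_text(sample)
    print(len(chunks), len(chunks[0].split()))  # number of chunks, words in first chunk
```

The overlap is there so a sentence cut at a chunk boundary still appears whole in at least one chunk; real pipelines often chunk by tokens or paragraphs instead of raw words.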
How do you take care of the indices on the embeddings column to make the query fast? I am working on a similar problem and want to build a RAG solution for some of my use cases. I am really looking forward to the next part. Hope it comes out soon.
You seem to be asking a question specific to your implementation. Can you tell me more about what you are trying to do?
@@technovangelist Let's say I have data in text format (not chunked), and the data is customer-specific, so I will have to chunk it and store the embeddings for each customer in the database. Now I am not sure how to store this data in Postgres, or whether to use a vector database, and how to query it per customer. I also want the query to be performant. How do I solve this problem, and which type of database should I use?
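One common shape for the per-customer question above is to tag every chunk with a customer ID and restrict the similarity search to that customer's rows (in Postgres with the pgvector extension, that would be a `customer_id` column you filter on, plus a vector index such as HNSW or IVFFlat for the similarity part). This is a toy in-memory sketch of that idea, not a definitive implementation; all names here are invented for illustration:

```python
import math
from collections import defaultdict

# Illustrative stand-in for a vector store that keeps each customer's
# embeddings partitioned, so a query only scans that customer's chunks.
class PerCustomerStore:
    def __init__(self):
        # customer_id -> list of (chunk_text, embedding) pairs
        self._rows = defaultdict(list)

    def add(self, customer_id: str, chunk: str, embedding: list[float]):
        self._rows[customer_id].append((chunk, embedding))

    def query(self, customer_id: str, query_emb: list[float], top_k: int = 2):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        # Score only this customer's rows, then return the best chunks.
        scored = [(cosine(query_emb, emb), chunk)
                  for chunk, emb in self._rows[customer_id]]
        scored.sort(reverse=True)
        return [chunk for _, chunk in scored[:top_k]]

store = PerCustomerStore()
store.add("alice", "invoice text", [1.0, 0.0])
store.add("alice", "shipping note", [0.0, 1.0])
store.add("bob", "unrelated doc", [1.0, 0.0])
print(store.query("alice", [0.9, 0.1], top_k=1))  # only Alice's rows are searched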
I want to do this with my bank statements so I can ask things like "How much did I spend on Pizza Hut last year".
matt can ollama do prompt caching? like claude and gemini do? they said it can fasten the Inference by more than half, rather than rag...
It’s not really comparable. You would have to build it. They have software in front of the model that does it. The model knows nothing about it.
@@technovangelist i see, i thought it was part if long context window they capable of, this is really complicated matter... having uploaded files resides in RAM/VRAM along with the whole context length... mind blowing assembly engineering it must be...
neo4j?
Great breakdown! One of the only sources I’ve seen mention this flow other than Steve sanderson’s talk th-cam.com/video/TSNAvFJoP4M/w-d-xo.html
Keep it coming! Would also loves video of books or courses on this type of learning in detail to augment your videos. Maybe a paid course one day too!
im in the middle on of training my llama 3.1 model to right now and i stopped to watch this vidto.