Came here out of curiosity, and ended up watching the full video. Thanks for taking the time to explain the very basics. Learned a lot!
YOU ARE JUST INCREDIBLE !!!!!!!!!! keep them coming. you are pretty much my main teacher.
This was amazing! Thank you very much for all the hard work - you’re incredible! Keep up the great content. My humble request: please continue with a UI for this RAG application.
It's incredible how good you are at explaining this tough topic! Thanks a lot for your work. Really appreciate it.
You are the man. I have watched God knows how many videos about RAG and I finally get it. Thank you very much!
You explain each step beautifully and make it so simple to understand. Thanks for providing this video.
Fantastic! You've really made my day by explaining it so clearly. Thank you!
Great video and nice that it's possible to run entirely locally, all with open source 🎉
Thank you. I have built many different RAG apps. A suggestion for a future video: compare the results from a completely local RAG with those from remote embeddings, say OpenAI. I am finding that the remote embeddings are consistently better quality and the Q&A responses are better. To me, it is pointless to build a RAG app if the embeddings are poor and the answers are mediocre!
Thanks for the fantastic explanation!
Outstanding. For the next video, I would love to see how LLMs are applied to mine unstructured data.
RAG is a powerful tool for working with open-source models. It's a good idea to explore alternative tools as well, ensuring you choose the best fit for your specific needs.
Absolutely loved it, Thank you for your efforts 🙂
Best explanation!
Thank You
Very clear and useful. Thank you!
Is it better to use a specialized embedding model like nomic-embed-text, or llama3.1 itself as the embedding model?
Also, can you please do a tutorial on some of the major RAG ideas, like building a self-correcting RAG (CRAG), and then compare the results with naive RAG using an evaluation framework like Ragas, Giskard, etc.?
This is a very good comment and a very useful request. I would like the model to respond not just with the answer but also with the source of that answer (file name + page number = making sure the model is not a drunk one 🧐). I believe this would be even better if we added a re-ranker model 💯
@@HassanAllaham that will be great! @underfitted can you please chime in?
@@HassanAllaham For that, you need to keep the retriever output in a variable or a list while executing.
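A rough sketch of what that could look like, in plain Python so it runs without any framework. Everything here is hypothetical and for illustration only: `Doc` is a stand-in for a retrieved chunk with `page_content` and `metadata` fields, and `fake_retrieve` / `fake_generate` stand in for the real retriever and model from the video, which are not reproduced here. The idea is simply to hold on to the retriever output, pass its text to the model as a plain string, and read the file name and page number back from the same chunks.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved chunk (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def answer_with_sources(question, retrieve, generate):
    # Keep the retriever output in a variable so it can be reused after generation.
    docs = retrieve(question)
    # The model receives plain text, so join the chunk texts into one string.
    context = "\n\n".join(d.page_content for d in docs)
    answer = generate(question, context)
    # Build the citations from the same chunks that were used as context.
    sources = [(d.metadata.get("source"), d.metadata.get("page")) for d in docs]
    return {"answer": answer, "sources": sources}


# Toy retriever and generator, for illustration only.
def fake_retrieve(question):
    return [
        Doc("Refunds are accepted within 30 days.", {"source": "policy.pdf", "page": 3}),
        Doc("Contact support to start a refund.", {"source": "policy.pdf", "page": 4}),
    ]


def fake_generate(question, context):
    return f"Based on {len(context.split())} words of context."


result = answer_with_sources("What is the refund policy?", fake_retrieve, fake_generate)
print(result["answer"])
for source, page in result["sources"]:
    print(f"{source}, page {page}")
```

Whether the page number is available depends on what the document loader stores in the metadata, so treat the `source` and `page` keys as assumptions to check against your own loader.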
Great video and explanation! Thank you. I have a question: will the context variable be passed to the model through the prompt as embeddings of each of the four pages, or will it be converted back to a string? Thank you in advance.
Thanks a lot for your very interesting video. It's great that your code always works out of the box. Only langchain-ollama was missing from the requirements.txt. And unfortunately faiss-gpu is not supported on Windows 11 (AFAIK?). Great stuff you are offering to all of us all the time. Your explanations are always so easy to understand. Amazing! Please keep going!
Great video
You explained RAG well.
Love the video! Could you please create a video showing how to export a Jupyter notebook into a proper project structure and deploy it on the cloud?
Santiago, I loved the video. Very clear explanation. I have a question. As you know, there is a limit on the size of a prompt. For example, if I want a summary of a whole document, in theory I have to pass the whole document to the LLM so it can summarize it. But this won’t fit in the context window. ChatGPT has, I think, a 128K limit on the API, but I don't think Ollama has this. Also, I have no idea if 128K is enough for any LLM. If I have already stored a large document in my vector database, how can I pass the whole document to the LLM to summarize it? I can't just add the whole document to the prompt. Thanks.
This was perfect thank you
Super amazing explanation!
For the next video: how to query documents that contain images. Can LLMs describe images or get context from them?
You are really really good
🙏🏻
Is it possible to run the Jupyter notebook on Google Colab? How would that work?
Upload it from your local machine.
Apparently import langchain_ollama does not exist. I keep getting this error when trying to run the model
faiss-gpu only supports up to Python 3.10. Is there an alternative?
I'm facing an issue trying to install faiss-gpu on a Mac with an M3 Pro chip. Is anyone else having this problem?
Hey there, thanks, your videos are really helpful. I am a student building a project around RAG. I would like a video on how to make an interface-oriented, or simple full-stack, RAG bot without a large GPU.
good video
Gravenberch was my motm
"An unnecessarily complicated introduction to RAG that only works locally." There, I fixed it for you.
May I know what is unnecessarily complicated? He is taking the time to go step by step for users to scale this solution for our use cases.
@@rayofvictory People always have to criticize, whatever it is.
If his explanation is too complex for you, maybe this subject is not for you.
@@o_glethorpe I am not talking about me. This is supposed to be an introduction. You don't need any of these to create a RAG.
@@rayofvictory You don't need LangChain or any of these other tools to create a RAG. Also, he mentions "a user employee asks", but all of this is local, so that's not true.
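To illustrate the "no framework needed" point, here is a toy sketch of the retrieval half of a RAG in pure Python. It is not the approach from the video: the `embed` function below uses word counts as a stand-in for a real embedding model, and all names and sample chunks are made up for the example. A real app would swap in actual embeddings and send the resulting prompt to an LLM.

```python
import math
from collections import Counter


def embed(text):
    # Toy "embedding": lowercase word counts. A real app would call an embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    # Rank every chunk by similarity to the question and keep the top k.
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, chunks, k=2):
    # The retrieved chunks go into the prompt as plain text.
    context = "\n".join(retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "the office is closed on public holidays",
    "employees get 20 vacation days per year",
    "the cafeteria serves lunch at noon",
]
question = "how many vacation days do employees get"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, chunks, k=1)
print(prompt)
```

Frameworks mostly add convenience on top of these steps (loaders, vector stores, prompt templates), which is a trade-off rather than a requirement.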