RAG from the Ground Up with Python and Ollama

  • Published Dec 30, 2024

Comments • 153

  • @decoder-sh
    @decoder-sh  9 months ago +13

    Thanks to @munchcup for sharing a great embedding model that is available straight from the ollama library ollama.com/library/nomic-embed-text 🔥
    Also, all of the code from this video is provided on my website decoder.sh/videos/rag-from-the-ground-up-with-python-and-ollama 👌

    • @munchcup
      @munchcup 9 months ago +2

      🙏 Am humbled

  • @mitchell2769
    @mitchell2769 9 months ago +29

    As a nonprofessional programmer, this was the best introductory video on RAG I have seen anywhere, and that's saying a lot with how many I've watched. Thank you! I look forward to many more videos continuing the series!

    • @decoder-sh
      @decoder-sh  9 months ago +1

      Thank you for sharing, I'm happy to hear it!

    • @SergesLemo
      @SergesLemo 8 months ago

      I second that.

  • @parttimelarry
    @parttimelarry 9 months ago +19

    This looks solid. Happy it's not all LangChain-specific like many tutorials out there. Saving for later.

    • @decoder-sh
      @decoder-sh  9 months ago +7

      I'll cover LangChain soon enough, but I wanted to start with a from-scratch implementation to teach the basics

    • @myhificloud
      @myhificloud 9 months ago

      @parttimelarry Your content was some of the first I absorbed as I entered the space. Spectacular and inspiring content. Looking forward to more if/when available, great work.

    • @decoder-sh
      @decoder-sh  9 months ago +2

      @parttimelarry Also, I just subscribed to your account, congrats on 100k! I'd like to eventually explore the intersection of finance and LLMs :)

    • @dbwstein
      @dbwstein 9 months ago

      This is a great video! I'm also looking forward to your LangChain video. I've been struggling with that.

  • @OgeIloanusi
    @OgeIloanusi 3 months ago +3

    This is a great video. You teach like a professor. You're an expert and very talented! Your organization will indeed love working with you.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 8 months ago +9

    I really think TH-cam should be recommending this channel. The content quality is very high.

    • @decoder-sh
      @decoder-sh  8 months ago +1

      Thanks for saying so, welcome to my channel!

  • @HarishPillay
    @HarishPillay months ago

    Thanks for your series of videos. They are to the point and very comprehensive. Thanks again!

  • @BP-kc3dj
    @BP-kc3dj 9 months ago +3

    FANTASTIC PRESENTATION! Thank you for being a good teacher. I stumbled on your channel after seeing the opposite of what you did. I mean it was really bad. Thank you!

    • @decoder-sh
      @decoder-sh  9 months ago

      Haha, well I'm sorry you had a bad experience before, but am glad you found your way here!

  • @christoherright6430
    @christoherright6430 months ago

    This is a true tutorial, explaining very well only the important information needed to implement RAG. Thanks.

  • @ronaldokun
    @ronaldokun 9 months ago +2

    I liked your overall presentation style. Objective and minimalist without being simplistic.

    • @decoder-sh
      @decoder-sh  9 months ago

      Thank you for watching!

  • @Jeff-x7d
    @Jeff-x7d 8 days ago +2

    Wow! Great video! You explained everything very clearly. I would love to be able to see how to do this using LangChain, etc... Thank you!

    • @decoder-sh
      @decoder-sh  7 days ago

      I made one video on LangChain so far! I've been on a longer hiatus than I wanted this year, but I'll continue with LangChain when I'm back :)
      th-cam.com/video/qYJSNCPmDIk/w-d-xo.html

  • @FrankenLab
    @FrankenLab 2 months ago

    I liked how you used cosine similarity to show the actual chunks of matched text. Since this usually happens behind the curtain, it was nice to have that extra insight. I like your style and just subscribed.
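For readers curious about the step this comment refers to, here is a minimal sketch of scoring stored chunks against a prompt with cosine similarity (the function names are illustrative, not taken from the video):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors:
    # 1.0 = same direction, 0.0 = orthogonal (unrelated).
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_similar_chunks(prompt_embedding, chunk_embeddings, chunks, k=3):
    # Score every stored chunk against the prompt embedding and
    # return the k best (score, text) pairs, most similar first.
    scored = [(cosine_similarity(prompt_embedding, e), c)
              for e, c in zip(chunk_embeddings, chunks)]
    return sorted(scored, key=lambda p: p[0], reverse=True)[:k]
```

Printing the returned (score, text) pairs before handing the text to the model is exactly the kind of "behind the curtain" logging the comment describes.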

  • @tinkerman1790
    @tinkerman1790 9 months ago +1

    Thx for sharing this amazing tutorial showing us what RAG is, as well as the "how-to", in such detail. Keep up the great work; I've clicked the "subscribe" button right away 👍🏻

    • @decoder-sh
      @decoder-sh  9 months ago

      Thank you for subscribing, I look forward to making more videos for you!

  • @gustavow5746
    @gustavow5746 9 months ago +1

    Best video about RAG on the web so far! Congrats! Your code runs! I think it needed an intro about Ollama and how to run it, but besides that, this video is fantastic!

    • @decoder-sh
      @decoder-sh  9 months ago +1

      Not a bad idea! I do have a whole playlist on Ollama, so you can choose where you want to jump in :) th-cam.com/play/PL4041kTesIWby5zznE5UySIsGPrGuEqdB.html

    • @gustavow5746
      @gustavow5746 9 months ago

      @@decoder-sh I wanted to suggest a video on how to train a model on specific documents. We can see that this approach has some limitations, such as the number of sentences passed as context to the model; depending on the number, the output is different. For example, if you choose the last five sentences vs. the last twenty, the output is a little different.
      Anyway, I have to say that this is the first video that combines it all (embeddings, vectors, LLM, documents, and pure code) together. Congrats again! Looking forward to your next content.

    • @decoder-sh
      @decoder-sh  9 months ago +1

      @@gustavow5746 Great idea! Model fine-tuning is definitely on my list :)

    • @gustavow5746
      @gustavow5746 8 months ago +1

      @@decoder-sh Hello, I ran some tests, and it looks like these models are already trained on the Peter Pan story. I tried asking with and without using the script, and the answers were very similar. Just wanted to give this feedback. Thanks.

    • @decoder-sh
      @decoder-sh  8 months ago +1

      @@gustavow5746 Ah, that's a great test to run, thank you for the information! I will take that into consideration for future videos.

  • @Gi-Home
    @Gi-Home 8 months ago +1

    Excellent tutorial, thank you so much. Your code example ran perfectly and the results were quite decent.

    • @decoder-sh
      @decoder-sh  8 months ago

      I'm glad to hear it! Which embedding model did you end up using?

  • @richsadowsky8580
    @richsadowsky8580 5 months ago

    Fantastic video. Yes, I could explain embeddings. I already had a basic concept, but your simple file-based cache of the embeddings really highlighted the basic functionality without needing a vector database.
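The file-based cache mentioned here can be sketched in a few lines (a hypothetical helper, not the video's exact code; `embed_fn` stands in for an embedding call such as `ollama.embeddings`):

```python
import json
import os

def load_or_embed(cache_path, chunks, embed_fn):
    # Embed the chunks once, save the vectors as JSON, and on later
    # runs load them from disk instead of re-embedding - no vector
    # database required.
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    embeddings = [embed_fn(chunk) for chunk in chunks]
    with open(cache_path, "w") as f:
        json.dump(embeddings, f)
    return embeddings
```

The trade-off versus a vector database is simplicity over scale: for a single book's worth of chunks, a JSON file on disk is plenty.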

  • @vuyombam4872
    @vuyombam4872 months ago

    Great tutorial David, clear and easy to follow. You just got a new subscriber 🦾

  • @dr.mikeybee
    @dr.mikeybee 6 months ago +3

    Nicely done. I'm building a chatbot with a memory system based on RAG. Prompts, responses, topics, entities, and tool results are stored in JSON documents. Embeddings are stored in ChromaDB. I think I'll dump Chroma and save embeddings the way you do; that removes a lot of complication. And I like your similarity search. I can dump LangChain too. I hope this will become a sophisticated memory system for chat history, etc. I'll delete or archive old memories that haven't been accessed in a week or a month, etc. This can be a hyperparameter that I can tune. I've already written a router program that decides what to retrieve and what tools to use, so no LangGraph either. Anyway, thanks for this. Publishing this helps everyone. Please do more like this.

  • @mbottambotta
    @mbottambotta 9 months ago +1

    Thanks for your clear and effective explanation. I'm grateful also for how you steered clear of frameworks that abstract the RAG implementation details away from you. In fact, I'd appreciate it if you could dive deeper into things like chunking strategies and agents.

    • @decoder-sh
      @decoder-sh  9 months ago

      Thanks for watching! I do plan on covering these topics in future videos. I'm currently writing a series of videos on more advanced RAG topics using either LlamaIndex or LangChain. I might do one that looks at "manual" PDF parsing versus built-in document loaders if people are interested.

    • @mbottambotta
      @mbottambotta 9 months ago

      @@decoder-sh Thanks! Much appreciated, looking forward to your future videos.

    • @laritaharrington1117
      @laritaharrington1117 8 months ago

      @@decoder-sh I am very new to AI and coding. How would you suggest I truly understand what these tools are doing? I have a great business plan, but my mind thinks in workflows, not like a programmer.
      Great content!

    • @mbottambotta
      @mbottambotta 8 months ago

      @@laritaharrington1117 Here's what I do: first, I code along with the video. Typically, I make mistakes and have to fix them; this already helps me understand better.
      Then, I come up with my own little project and try to implement that. This usually takes far longer than I had originally planned for, but it does mean that I learn about the limitations and pitfalls.
      If you like, we can do a little project together; learning this way is probably even more effective.

  • @SimplyCarolLee
    @SimplyCarolLee 8 months ago

    I came here upon the recommendation of my good friend and this video is so educational. Loved it! ❤

    • @decoder-sh
      @decoder-sh  8 months ago

      Your friend clearly has good taste, thanks for watching!

  • @BigSenior472
    @BigSenior472 9 months ago

    Thanks!

  • @kushspatel
    @kushspatel 8 months ago

    I love these videos, very helpful. I am a junior dev trying to understand some of these concepts, and I feel like these videos have helped me immensely!

    • @decoder-sh
      @decoder-sh  8 months ago +1

      I'm glad to hear it, keep learning!

  • @hugopristauz538
    @hugopristauz538 5 months ago

    Nice, small-sized demo demonstrating the principles. Good job, I learnt a lot 🙂

    • @decoder-sh
      @decoder-sh  5 months ago

      Glad you liked it!

  • @munchcup
    @munchcup 9 months ago +3

    For the embedding part, there is a very fast model specific to embedding in Ollama named nomic-embed-text, which simplifies the process. Just a point to note.

    • @decoder-sh
      @decoder-sh  9 months ago +1

      Awesome suggestion, thank you! Wow, and they're MRL embeddings? I've been meaning to do a video on these ollama.com/library/nomic-embed-text

    • @munchcup
      @munchcup 9 months ago

      @@decoder-sh Thank you for your teachings. You've opened me up to endless possibilities in using Ollama as a self-taught dev.

  • @mrrohitjadhav470
    @mrrohitjadhav470 9 months ago +1

    Awesome lesson, precisely what I had been looking for. Please look at fine-tuning this existing model with many documents. I looked everywhere and couldn't locate one without utilising an API.

    • @decoder-sh
      @decoder-sh  9 months ago

      Thanks for watching! I plan to cover fine-tuning soon, there are a lot of interesting techniques to address

    • @mrrohitjadhav470
      @mrrohitjadhav470 9 months ago

      @@decoder-sh Great ❤

  • @edwardtaft8813
    @edwardtaft8813 9 months ago

    Really great! Thank you. Looking forward to the next steps here with LangChain!

  • @rakibuzzamanrahat
    @rakibuzzamanrahat 9 months ago

    Great videos, I just started in this space and started following your videos.

    • @decoder-sh
      @decoder-sh  9 months ago

      Welcome! I peeked at your Hands-On Machine Learning video - that's a great textbook :)

  • @nandinijampana528jampana3
    @nandinijampana528jampana3 3 months ago +1

    First of all, thank you for making this video!! Can you also make a video on how to handle multiple text files? Thank you.

  • @yaa3g
    @yaa3g 9 months ago

    Well prepared and executed, super useful, thanks for your work.

    • @decoder-sh
      @decoder-sh  9 months ago

      Thank you very much! Looking forward to making more

  • @txbluesguy
    @txbluesguy 9 months ago

    This is fantastic. It fits perfectly with a project I am working on.

    • @decoder-sh
      @decoder-sh  9 months ago

      That’s great! May I ask what you’re building?

  • @richardyim8914
    @richardyim8914 7 months ago

    Brilliant video. 10/10. Will be recommending to everyone I know.

    • @decoder-sh
      @decoder-sh  7 months ago

      Thanks very much!

  • @pedrogorilla483
    @pedrogorilla483 7 months ago

    Man, I need to study more. Thanks for putting this video out!

    • @decoder-sh
      @decoder-sh  7 months ago

      My pleasure, thanks for watching!

  • @theo1582
    @theo1582 7 months ago

    I'm looking forward to the next video about RAG using LangChain! :)

  • @proterotype
    @proterotype 8 months ago

    Great stuff per usual. Looking forward to the LangChain video

  • @bhagavanprasad
    @bhagavanprasad 7 months ago

    Thank you for sharing knowledge. Looking forward to more such videos

  • @Enkumnu
    @Enkumnu 7 months ago

    I really like it. Your videos are very clear. The basics are important! I use the same approach with scanned documents, converted from PDF to text and stored in a database (filename, link, text, and tokenization). With Streamlit I do a search. What I would like to do is create an exportable model (e.g., the plant guide for healing) and be able to use it. What would be the best approach? Thank you for your answer.

  • @mistercakes
    @mistercakes 9 months ago +2

    A little confused about how there was similarity with "who is the story's primary villain". How can we know it's RAG that is providing this answer and not the inference model?
    It would also be nice to see the similar chunks and then convert them back to strings, to understand what the model got as input before it responded with the answer about "Hook".
    I think that's the only major thing missing from your tutorial.
    Thanks again for the good content.

    • @decoder-sh
      @decoder-sh  9 months ago

      Thank you for this feedback! This is a great point; I should've kept logging the chunks that are being passed as context. I'll keep that in mind for the next video.
      I encourage you to try this on your own system at home, but Mistral is good enough at instruction following that it adhered to our system prompt and only used the context it was provided. I'll try to explicitly show a failure case to demonstrate that our model is behaving as expected in future videos.
      Thanks for watching :)

    • @Djeez2
      @Djeez2 9 months ago

      You could test that by replacing "Hook" in the returned chunks with some other name and then seeing what the LLM returns as an answer.

    • @decoder-sh
      @decoder-sh  9 months ago

      @@Djeez2 Yep, this is a great idea. You could even set up unit tests for different models that either include or don't include the answer to a given question. The only challenge there is getting the model to say exactly "I can't answer that with the given context" or something that plays nice with simple tests.
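The "swap the name" check from this thread could look roughly like the sketch below (the prompt wording, the `swap_name` helper, and the use of `ollama.chat` are illustrative assumptions, not the video's code):

```python
SYSTEM_PROMPT = (
    "You are a helpful reading assistant who answers questions based only "
    "on the provided context. If the answer is not in the context, say "
    "\"I can't answer that with the given context.\""
)

def swap_name(chunks, old="Hook", new="Smee"):
    # Poison the retrieved context: replace every mention of the real
    # villain, so a context-faithful model must now answer with the
    # substituted name (or refuse) instead of leaning on training data.
    return [chunk.replace(old, new) for chunk in chunks]

def answer(question, chunks, model="mistral"):
    import ollama  # deferred: needs the library and a running server
    response = ollama.chat(model=model, messages=[
        {"role": "system",
         "content": SYSTEM_PROMPT + "\nContext: " + "\n".join(chunks)},
        {"role": "user", "content": question},
    ])
    return response["message"]["content"]
```

A unit test would then assert that `answer(villain_question, swap_name(chunks))` mentions "Smee" or the refusal phrase, and never "Hook".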

  • @EricBurgers-qc6ox
    @EricBurgers-qc6ox 2 months ago

    Wow, excellent vid. Thanks!

  • @albertoavendano7196
    @albertoavendano7196 24 days ago

    Top-notch video ...
    Thanks a lot ...
    I would love to know if we can use GPT locally as another option besides Ollama.

    • @decoder-sh
      @decoder-sh  7 days ago

      Hi there, could you explain a bit more about what you want? I love running local models, so I'd like to help figure this out. If you're asking to use OpenAI's ChatGPT: you can access it through OpenAI's API, but that wouldn't be a "local" model, since you're sending all of your requests to OpenAI.

  • @jernr
    @jernr 8 months ago

    Does anyone know how to increase the size of the paragraphs so that the responses are more useful? That part was skipped over in the video, if I recall correctly.

  • @Van-Helssen
    @Van-Helssen 7 months ago +1

    Amazing, thanks mate ❤

  •  5 months ago

    Thanks a lot for the detailed explanation in the video! I have a question regarding Ollama: is it possible to use Ollama and the models available on it in a production environment? I would love to hear your thoughts or any experiences you might have with it.

  • @skperera-g8l
    @skperera-g8l 5 months ago

    Fantastic video! The RAG example given is for a single document, but a repository usually contains dozens of documents. Is there a way to bulk-upload the documents at once to the LLM (for chunking and embedding)? Thanks.

  • @cj_is_here
    @cj_is_here 9 months ago

    Awesome video. Very well explained. Congratulations

    • @decoder-sh
      @decoder-sh  9 months ago

      Thank you CJ!

  • @CV-wo9hj
    @CV-wo9hj 8 months ago

    Would love to see you do a version of this using nomic, llama3, and ChromaDB for your vector store

  • @Aberger789
    @Aberger789 8 months ago

    Fantastic video! I've been trying to build a Streamlit "chat with your document" app that also keeps the historical context of the chat. But I'm beginning to wonder if that's not as useful as just sequencing it out, considering each turn will require context retrieval. All good.

  • @tinytube4me-z8h
    @tinytube4me-z8h 9 months ago +1

    This is the best, easiest-to-understand intro to RAG. Do I need to install Ollama on my Windows machine before running your code? And replace 'nomic-embed-text' with the model when running your code? Could you please do a video on Llama 2? I downloaded Llama 2 from Meta, but don't know how to use it, such as how to open it or query it. Your code-driven approach best explains everything.

    • @decoder-sh
      @decoder-sh  9 months ago

      Thanks for watching! You will need to install Ollama. Furthermore, you can actually use the llama-2 model directly from Ollama! ollama.com/library/llama2

  •  9 months ago

    Great video. Looking forward to seeing a 2nd part with LlamaIndex, and what do you recommend for working with PDF files!

    • @decoder-sh
      @decoder-sh  9 months ago

      Definitely! I could do a whole video just on working with PDFs, it can be a pain 😰

  • @robots_id9112
    @robots_id9112 9 months ago +1

    Great videos! I want to ask: how do we set it up so it can answer multiple follow-up questions while retaining the same context?

    • @decoder-sh
      @decoder-sh  9 months ago +1

      The chat method takes a list of messages, so you can just add your previous interactions to that list to give your LLM context. Here's an example th-cam.com/video/ZHZKPmzlBUY/w-d-xo.html
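That pattern, sketched (the `build_messages` helper is a hypothetical name; the `ollama.chat` call itself needs a running Ollama server, so it is left in an uncalled demo function):

```python
def build_messages(history, user_prompt,
                   system_prompt="You are a helpful assistant."):
    # The chat endpoint is stateless: conversational "memory" is just
    # the list of prior turns you pass back in on every call.
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history)
    messages.append({"role": "user", "content": user_prompt})
    return messages

def chat_loop():  # not executed here; requires a local Ollama server
    import ollama
    history = []
    for question in ["Who wrote Peter Pan?", "When was it published?"]:
        reply = ollama.chat(model="mistral",
                            messages=build_messages(history, question))
        content = reply["message"]["content"]
        # Store both sides so follow-ups like "it" resolve correctly.
        history += [{"role": "user", "content": question},
                    {"role": "assistant", "content": content}]
```

The cost of this approach is that the history eventually outgrows the model's context window, at which point older turns have to be dropped or summarized.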

  • @ΛΑΦ
    @ΛΑΦ 7 months ago +1

    Great explanation! How does this implementation perform compared to LangChain? I have heard many times that LangChain is not suitable for production. Do you think the responses are faster here due to fewer abstractions compared to LangChain? Thanks once again.

    • @decoder-sh
      @decoder-sh  7 months ago +1

      If I were deploying a RAG app in a production environment, LangChain gives you a lot of useful debugging, tracing, and serving capabilities that a raw implementation like mine would not have. And if speed were your main concern, I would probably start by replacing Ollama with Groq, or at least vLLM if you're determined not to use a paid API. Thanks for your question! I have part 1 of a LangChain series coming out in the next couple of days

    • @ΛΑΦ
      @ΛΑΦ 7 months ago

      Thank you so much for your answer. Another question I have with RAG applications is how to handle contextual questions in a chat. For example, let's say I have a txt about a project, with a description, implementation details, and applications. If I ask what "projectname" is, the embedding model will feed the description paragraph to the LLM. But if my next question is "and how is this implemented?", the embedding model will not know that I am referring to "projectname", so the LLM will either not answer or hallucinate. If I feed it more similar paragraphs so that it can answer based on chat history, then I will start facing context-window limitations. Is there a solution for this?

  • @billmarshall383
    @billmarshall383 8 days ago

    Great video. Is there somewhere to download the code you used? Currently, I am pausing the presentation and typing from your screen, but there must be a better way. Thanks, this is the best RAG description I have found.

    • @decoder-sh
      @decoder-sh  7 days ago

      Hey Bill, thanks for watching! You can find all the code from my videos here
      decoder.sh/videos/rag-from-the-ground-up-with-python-and-ollama

  • @doctorbill37
    @doctorbill37 9 months ago

    Excellent video, explaining from the ground up how RAG works without immediately starting with LangChain. I have played with some other RAG implementations using Ollama and have intentionally asked questions outside the purview of the RAG content, and they do come back with the correct "not found in the documentation" response. But what still remains unclear to me is how an LLM is kept restricted to only accessing RAG content. How is that fence guaranteed? Does setting the temperature affect that fence?

    • @decoder-sh
      @decoder-sh  9 months ago +1

      It's difficult to make guarantees with LLMs, but your best bet is to use a model whose training data includes instruction following. Mistral is good at this by default, but also has an instruct fine-tuned version available on Ollama!

  • @AaronGayah-dr8lu
    @AaronGayah-dr8lu 7 months ago

    This was well done. Thank you. Will you, by chance, be working on any videos to make RAG web apps?

    • @decoder-sh
      @decoder-sh  7 months ago +1

      I am more interested in the backend than the frontend of RAG apps, but what did you have in mind?

    • @AaronGayah-dr8lu
      @AaronGayah-dr8lu 7 months ago

      @@decoder-sh I'm thinking something along the lines of the RAG interacting with databases as well as individual documents. In my case, it's PostgreSQL. The issue really is the number of options available, such as Pinecone, pgvector, ChromaDB, and the like. Maybe my search has not been as comprehensive as it needs to be, but having a proper comparison of these options would help with making the optimal selection.
      My use case involves analyzing an ERP database of tens of thousands of inventory items to identify issues such as potential duplication, errors and typos, the potential to standardize items - meaning having one inventory item for multiple applications instead of different items per application (this involves looking at item specifications) - and the like. There will likely be attached product specification sheets and other literature available; this video has demonstrated how I can manage these attachments accordingly.
      Thank you for taking the time to respond to my comment.

  • @maheshsanjaychivateres982
    @maheshsanjaychivateres982 9 months ago +1

    Thank you for this video lesson

    • @decoder-sh
      @decoder-sh  9 months ago

      You are welcome!

  • @juliandarley
    @juliandarley 8 months ago

    This is very well done, thank you. When is your next video, and will it be on refining and improving RAG?

    • @decoder-sh
      @decoder-sh  8 months ago

      My next videos will be discussing the new models from Meta and Microsoft, and I'm working on another one that introduces LangChain :)

  • @user-he8qc4mr4i
    @user-he8qc4mr4i 9 months ago +1

    Very, very nice! Thx for sharing!

    • @decoder-sh
      @decoder-sh  9 months ago

      You're on a roll, thanks for watching!

  • @lilaiyad2173
    @lilaiyad2173 months ago

    Is there a way to let Llama answer from its own knowledge base when my own data is not relevant, or not accurate enough to be an answer to the user's prompt?

  • @khalidkifayat
    @khalidkifayat 8 months ago

    NICE one. A question: how would you deliver this to a client as a remote work project? And what cost-optimization factors should be considered?

  • @mahaltech
    @mahaltech 5 months ago

    Hello, it's a very good tutorial.
    I have a file containing some articles.
    It works well with small articles of 2-3 lines,
    but if an article is more than 20 lines, the response comes from nowhere.
    Can I increase the chunk size, or is there another solution?
    Thank you in advance.

  • @CV-wo9hj
    @CV-wo9hj 8 months ago

    Keep it up bud, your videos are great 👍

  • @mbarsot
    @mbarsot 9 months ago +1

    Hello. Two separate questions.
    a) Your code works (thanks) and I'm using it for a task at my workplace (basically helping extract info from a repository of dozens of .docx files). However, it seems that the model only knows about the input file (the embeddings) and "forgot" everything else. That is, I can query the txt and get mostly correct answers, but if my answer needs to be complemented by additional info (that I know is available if I query Mistral on Ollama normally), it does not have a clue. Or am I doing something wrong? The "augmented" in RAG seems to indicate that we add (maybe with higher priority) the knowledge in the input docs to the existing knowledge, but it seems not to be the case...

    • @decoder-sh
      @decoder-sh  9 months ago +1

      Hi there, I'm glad to hear the code works! It sounds like you need to change the system prompt, since we're explicitly telling it to only use the provided context and no knowledge that it already has. Let me know if that works!

    • @mbarsot
      @mbarsot 9 months ago

      @@decoder-sh Thank you for your answer. I changed the system prompt as follows, but no change...:
      SYSTEM_PROMPT = """You are a helpful reading assistant who answers questions based on snippets of text provided in context and also with general knowledge. Be as concise as possible. If you're unsure, just say that you don't know. Reply in the same language used by the user question. Context:"""
      (PS: I added the language instruction as it is digesting documents in French but I want answers in EN.)

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 8 months ago +1

    Do you think it's possible to use this, Ollama + RAG, in conjunction with CrewAI?

    • @decoder-sh
      @decoder-sh  8 months ago

      Definitely! I know that CrewAI has RAG tools and can use Ollama for inference. I plan on covering CrewAI in a future video :)

  • @MidSeasonGroup
    @MidSeasonGroup 9 months ago

    Great video. Do you recommend RAG as an ideal way to update LLMs with framework library changes and updates, or is fine-tuning the way to go?

    • @decoder-sh
      @decoder-sh  9 months ago

      Yes, I think RAG is definitely the best way to empower your LLM to give up-to-date answers about dynamic data like documentation. You'll just need to update your embeddings when the data changes.

  • @aroonsubway2079
    @aroonsubway2079 8 months ago

    Thanks for this great video. Just a newbie question from me: how do I formulate a RAG setup to make Ollama understand a specific text format? I provided mixtral a list of timed activities for a person like this:
    Day1 7-00-00 : breakfast
    Day1 7-00-08 : workout
    Day1 7-00-16 : reading
    ......
    Day20 7-00-00 : breakfast
    Day20 7-00-08 : reading
    Day20 7-00-16 : workout
    I asked the LLM to print out all instances of activities at 7-00-00 across all these 20 days, which is super easy for a human, but the results from the LLM were always wrong ... Can you give me some instructions? Is RAG suitable for processing such text data?

    • @decoder-sh
      @decoder-sh  8 months ago +1

      With data that is as cleanly structured as this, why couldn't you just write some simple code to filter the results for you? Like `data.split('\n').map(line => line.split(' ')).filter(([day, time, topic]) => time === '7-00-00')`? Or you could put your data into a database, give the schema to an LLM, and have it write queries for you - this is called self-querying python.langchain.com/docs/modules/data_connection/retrievers/self_query/
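A Python version of that filtering idea, for anyone following along in this video's language (the sample log and the `activities_at` helper are made up to match the commenter's format):

```python
LOG = """Day1 7-00-00 : breakfast
Day1 7-00-08 : workout
Day20 7-00-00 : breakfast
Day20 7-00-16 : workout"""

def activities_at(text, time):
    # Each line is "<day> <time> : <activity>"; plain parsing answers
    # "what happened at 7-00-00?" exactly, with no LLM involved.
    rows = [line.split(" : ") for line in text.splitlines()]
    return [(day_time.split()[0], activity)
            for day_time, activity in rows
            if day_time.split()[1] == time]
```

For exact lookups over cleanly structured data like this, deterministic code is both cheaper and more reliable than retrieval plus generation.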

  • @mohamedkeddache4202
    @mohamedkeddache4202 9 months ago

    Please, can someone tell me how to activate streaming on the response?

  • @nicosoftnt
    @nicosoftnt 9 months ago

    Very interesting, thanks. Do you know how this compares to Haystack 2.0? I understand that Haystack, being a framework, offers more scalability, while this is more of a DIY script?

    • @decoder-sh
      @decoder-sh  9 months ago +1

      This is my first time hearing about Haystack, but this is definitely just a simple DIY script to build an understanding of what's happening under the hood of more advanced libraries like Haystack, LangChain, etc.

  • @emil8367
    @emil8367 9 months ago

    Many thanks! What you share with us is awesome. It would be great to see more on how to use bge-base 🙂 Or not, actually: I watched your YT video on converting a GGUF file to an Ollama model and used it in: prompt_embedding = ollama.embeddings(model="bge-base-en-16", prompt=prompt)["embedding"]. It seems to be working, but I need to test it more 🙂
    In that case, maybe you could create a tutorial on how to build from scratch a speech-to-speech multimodal assistant operating e.g. on Ubuntu?

  • @arvinsim
    @arvinsim 7 months ago

    For me, I had to remove `["embedding"]` on line 22 for some reason. Any reason why?

    • @decoder-sh
      @decoder-sh  7 months ago

      It's possible that the interface has changed since I created the video?

    • @arvinsim
      @arvinsim 7 months ago

      @@decoder-sh It's possible. I am getting an error on `needle_norm = norm(needle)`:
      TypeError: unsupported operand type(s) for *: 'dict' and 'dict'
      May I know which numpy version you are using?

  • @myhificloud
    @myhificloud 9 months ago

    Curious, do you GitHub? Thanks for another great video and useful content, very much appreciated.

    • @decoder-sh
      @decoder-sh  9 months ago +2

      Yes I have githubbed once or twice, why do you ask?
      I appreciate you watching, it means a lot!

    • @myhificloud
      @myhificloud 9 months ago

      @@decoder-sh Thank you for your reply. I am not a developer, yet learning/absorbing.
      GitHub key points (for my workflow):
      - presents a reliable single source of truth
      - centralized repository for rapidly growing/scaling projects
      - serves as a hub for linked resources (e.g. websites, youtube, code, updates, changes, new projects, etc)
      - project versioning
      - lowers the barrier to entry, while simplifying versioned resource aggregation for everyone, at every level

  • @squiddymute
    @squiddymute 9 months ago

    Can you create embeddings from images? Using the llava model?

    • @decoder-sh
      @decoder-sh  9 months ago

      Very interesting question! It's definitely possible to create image embeddings, and in fact that's how image search works on Google. However, I don't think you can create image embeddings directly from Ollama. It looks like Llava embeds images using the CLIP model internally.
      github.com/haotian-liu/LLaVA/blob/main/llava/model/multimodal_encoder/clip_encoder.py
      huggingface.co/docs/transformers/v4.21.0/en/model_doc/clip

    • @decoder-sh
      @decoder-sh  9 months ago

      Interesting conversation on the topic x.com/willdepue/status/1772050291757850826?s=46

  • @skyblaze6687
    @skyblaze6687 8 months ago

    Thanks a million, man. I really hate those xxxx LlamaIndex people's absurd way of intentionally forcing their own storage

    • @decoder-sh
      @decoder-sh  7 months ago +1

      I'll look at LlamaIndex in a few videos, but the great thing about code is that you can always jam in your own solution 😎

  • @PenicheJose1
    @PenicheJose1 8 months ago

    Do I have to install Ollama in every new Python environment I create?

    • @decoder-sh
      @decoder-sh  8 months ago +1

      Yes, you will need to install the Ollama Python library in every new Python environment you create; no, you will not need to install the Ollama application (which serves the models and API that the Python library talks to) for each new environment.

    • @PenicheJose1
      @PenicheJose1 8 months ago

      @@decoder-sh I see, thank you!

  • @GaryHost-qs9pg
    @GaryHost-qs9pg 8 months ago

    How long did it take to generate embeddings for the entire book?

    • @decoder-sh
      @decoder-sh  7 months ago

      It will depend on your hardware, embedding parameters, and embedding model. Ideally it would take a minute or two. Using mistral, I think it took 5-10 minutes on a MacBook M1, which is why I recommend against using that as your embedding model.

  • @zeeshanfakhar1933
    @zeeshanfakhar1933 8 months ago +1

    Hi,
    What are your computer specs for running Ollama?

    • @decoder-sh
      @decoder-sh  8 months ago

      Hi, I’m using an M1 MacBook Pro, but Ollama itself has minimal requirements (you don’t even need a GPU!). It really comes down to the model you’re trying to run.

  • @lowpolyduck
    @lowpolyduck 9 months ago

    LETS GO

    • @decoder-sh
      @decoder-sh  9 months ago +2

      💻🦆

  • @Aristocle
    @Aristocle 8 months ago

    Could you give a similar example, but using graph databases like Neo4j?

    • @decoder-sh
      @decoder-sh  8 months ago

      Yes, I would love to do something with graph databases! Do you have a specific use case in mind?

    • @Aristocle
      @Aristocle 8 months ago

      @@decoder-sh Try texts with examples of Aristotelian syllogisms (from which propositional logic originated), to see if it holds the thread of reasoning. 😁 They are short, but hard to follow.
      Or logical documents in general, where you can see if it does well or badly.

  • @grumpyguy7656
    @grumpyguy7656 8 months ago

    Wow, my PC took 269.115 to embed 85 sentences with tinydolphin. I think I need to use the API key method.

    • @decoder-sh
      @decoder-sh  7 months ago

      Try out nomic! I've found it to be really fast since it's trained specifically for this task.