"I want Llama3 to perform 10x with my private knowledge" - Local Agentic RAG w/ llama3

  • Published on 18 May 2024
  • Advanced RAG 101 - build agentic RAG with llama3
    Get free HubSpot report of how AI is redefining startup GTM strategy: clickhubspot.com/4hx
    🔗 Links
    - Follow me on twitter: / jasonzhou1993
    - Join my AI email list: www.ai-jason.com/
    - My discord: / discord
    - Corrective RAG agent: github.com/langchain-ai/langg...
    - LlamaParse: github.com/run-llama/llama_parse
    - Firecrawl: www.firecrawl.dev/
    - Jerry Liu build production-ready RAG: • Building Production-Re...
    ⏱️ Timestamps
    0:00 Intro
    1:33 How to give LLM knowledge
    3:05 Problem with simple RAG
    5:55 Better Parser
    9:01 Chunk size
    11:40 Rerank
    12:39 Hybrid search
    13:10 Agentic RAG - Query translation
    14:35 Agentic RAG - metadata filtering
    15:52 Agentic RAG - Corrective RAG agent
    17:33 Install Llama3
    18:00 Code walkthrough
    👋🏻 About Me
    My name is Jason Zhou, a product designer who shares interesting AI experiments & products. Email me if you need help building AI apps! ask@ai-jason.com
    #llama3 #rag #llamaparse #llamaindex #gpt5 #autogen #gpt4 #autogpt #ai #artificialintelligence #tutorial #stepbystep #openai #llm #chatgpt #largelanguagemodels #largelanguagemodel #bestaiagent #chatgpt #agentgpt #agent #babyagi
  • Science & Technology

Comments • 155

  • @Jim-ey3ry
    @Jim-ey3ry 18 days ago +88

    This is prob one of the best RAG videos I've seen, so many learnings in 20 mins

  • @jaanireel
    @jaanireel 15 days ago +14

    00:05 AI can revolutionize Knowledge Management
    01:46 Llama3 can process precise knowledge with fast inference
    05:27 Market strategy for AI startups
    07:16 Convert PDF files to markdown format for enhanced accuracy and control
    10:47 Finding the optimal chunk size through experiments
    12:34 Hybrid search combines Vector search and keyword search for better results
    16:12 Building a local agentic RAG with llama3
    17:48 Running Llama3 model on local machine and using Visual Studio Code
    20:53 Setting up key components for Llama3 performance
    22:20 Creating a complex agentic RAG workflow for document retrieval and answering

  • @titusblair
    @titusblair 18 days ago +3

    Yet again an amazing tutorial, thanks so much Jason!

  • @kenchang3456
    @kenchang3456 18 days ago +46

    Man, your videos keep getting better every time I look. You have a great mind and your presentation is excellent. Thank you very much, again, for sharing!

    • @magicismagic123
      @magicismagic123 10 days ago +1

      he is much better than 99.9% of the wannabe overhyped AI gurus on YouTube, Twitter and LinkedIn!

  • @contractorwolf
    @contractorwolf 15 minutes ago

    Jason, I watch a lot of AI videos but I learn the most from yours. I am actually excited every time I see you have put another one out. Keep up the great work!

  • @shyamvai
    @shyamvai 14 days ago +2

    One of the most informative RAG videos I’ve seen. Can’t wait to see more from your channel.

  • @starmap
    @starmap 7 days ago +1

    Great content! Thanks for putting in the effort. Will use this.

  • @FightFlixTv
    @FightFlixTv 17 days ago +2

    This is the best RAG video on the internet, awesome job, no fluff, high complexity but easy to understand, nice work

  • @PIOT23
    @PIOT23 17 days ago +1

    What a great video! Thanks for sharing your knowledge

  • @MyAmazingUsername
    @MyAmazingUsername 14 days ago +1

    Really great tutorial, teaches a lot in very short time! Thanks!

  • @seventhapex
    @seventhapex 17 days ago +1

    dude... great video! Thanks for the knowledge!

  • @fredygerman_
    @fredygerman_ 14 days ago +4

    You always amaze me by the amount of knowledge I get from your videos

  • @CynicalWilson
    @CynicalWilson 18 days ago +18

    Holy crap! This gave me such amazing background knowledge, love it! Now, what would be extra cool, would be if you could do a real "hands-on" type of workshop to go through it all by setting up the environment completely, including the actual training/RAG implementation of a set of various document types (PDF, excel, website etc..) to extend a locally running llama 3 instance 😊

  • @scottmiller2591
    @scottmiller2591 18 days ago +33

    1) The link for the corrective RAG agent had an extra URL attached at the end which caused it to fail; manually tracing the link got me to the proper location
    2) LlamaParse looks like a wonderful tool, since I have a lot of documents with equations, and I really need it to grab equations, if for no other reason than to return them. Unfortunately, LlamaParse requires an API key and seems to send PDFs off for processing, something that others have noted and there is an open issue from 2 weeks ago. As of 3 hours ago, it's still an open issue - clearly most companies don't want to send internal docs out of house. Hopefully this gets resolved soon.
    3) Really liked your presentation - easy to follow every step with the provided materials.

    • @dennou2012
      @dennou2012 16 days ago +1

      Hopefully we will have better options for local use - shame it's not a local-only pipeline yet

    • @yunxinglu4020
      @yunxinglu4020 14 days ago +1

      yes - I have found this issue too. LlamaParse seems to use an OpenAI LLM to process the PDF, and that leads to privacy concerns.

  • @free_thinker4958
    @free_thinker4958 18 days ago +3

    You're the man 💯👏

  • @dataanalysiscourse785
    @dataanalysiscourse785 18 days ago +3

    Awesome content!

  • @beelzebub2808
    @beelzebub2808 2 days ago +1

    This is extremely helpful! Awesome!

  • @jasonfinance
    @jasonfinance 18 days ago +4

    Didn't know about the Agentic RAG techniques, thanks for sharing!! That's definitely a trade off between speed & quality, but good to have the option

  • @priyankajain1691
    @priyankajain1691 14 days ago

    Amazing tutorial! Thank you

  • @jorper98
    @jorper98 15 days ago

    Amazing info shared. Thank you!

  • @Max-hj6nq
    @Max-hj6nq 17 days ago +1

    Solid video Jason

  • @user-rj1eu6kp3u
    @user-rj1eu6kp3u 18 days ago

    right when i needed it, thank you man!
    also, just finished watching and i understood the theory behind it but kinda got lost during the code explanation, i might watch it again and again

  • @MrSuntask
    @MrSuntask 18 days ago

    Great tutorial! Thank you

  • @Hash_Boy
    @Hash_Boy 18 days ago +1

    many many thanks, bro!

  • @tkp2843
    @tkp2843 18 days ago +36

    Firecrawl boosted our RAG accuracy at our company. fast + provided good markdown format.
    Llama parse also super helpful too! Amazing video Jason! This is gold!
    Edit: thanks for the likes :)

    • @rafaelmiller9147
      @rafaelmiller9147 18 days ago

      The search api is just insane on firecrawl

  • @puzitrajSinghKR
    @puzitrajSinghKR 11 days ago +2

    Thanks!

  • @liamlarsen9286
    @liamlarsen9286 18 days ago

    awesome jason thank you

  • @renderwood
    @renderwood 6 days ago

    Keep this up. This answered loads of questions I had previously that were not answered in any of the HuggingFace tutorials!

  • @tunesafari8952
    @tunesafari8952 16 days ago

    Great video, thanks

  • @jackmermigas9465
    @jackmermigas9465 21 hours ago

    wow nice work thanks!

  • @AdahAugustine-fy6xx
    @AdahAugustine-fy6xx 5 days ago

    Thanks... Awesome video

  • @MyWatermelonz
    @MyWatermelonz 18 days ago +5

    I prefer finetuning to RAG first then RAG on top of the finetuned model. Just a simple QLORA is all you need. It really helps a ton.

    • @helix8847
      @helix8847 17 days ago

      How would you go about doing that, as in just do it backwards from the video?

  • @mathavansg9227
    @mathavansg9227 17 days ago

    Best video💯

  • @gaijinshacho
    @gaijinshacho 18 days ago +10

    Great timing! Why do you always read my mind JASON!!?! lol

  • @jaydencollier9339
    @jaydencollier9339 9 days ago

    I am literally using this technique now in my internship for a project. I went through so many approaches and ended up on my version of this one. Wish you released this video about 2 months ago lol

  • @abdallahelra3y118
    @abdallahelra3y118 15 days ago

    This is epic! keep up...

  • @Entropy67
    @Entropy67 12 days ago

    Subscribed, don't have an AI company since I'm still a poor student... this video was very informative, the man speaks at two times speed just like my professor. I respect it 😁

  • @MrStevemur
    @MrStevemur 10 days ago

    Thanks! It's so fascinating how these programs 'think.' Even if I don't install one, concepts like chunking seem to translate to humans as well.

  • @kartiknighania8588
    @kartiknighania8588 17 days ago +1

    OG Jin Yang from Silicon Valley.. Amazing video 🎉

  • @szpiegzkrainydeszczowcow8476
    @szpiegzkrainydeszczowcow8476 18 days ago

    You are relevant, Subscribing to your channel!

  • @mrkubajski9528
    @mrkubajski9528 12 days ago

    I have to say, it is great :D

  • @ConsultingjoeOnline
    @ConsultingjoeOnline 16 days ago

    Clicked that BELL too! 🔔

  • @LibertyRecordsFree
    @LibertyRecordsFree 16 days ago +1

    Amazing lesson! I learned a lot in just 20 min!

  • @asetkn
    @asetkn 17 days ago

    Platform agnostic LLM space overview videos from Jason are the best on AI YT

  • @ex3aliber
    @ex3aliber 18 days ago

    Amazinnnnggggg🎉🎉🎉🎉

  • @azathought_games
    @azathought_games 16 days ago +1

    Such a bait and switch. Thumbnail promises fine tuning tutorial. Delivers best improve-your-RAG video on the internet. Excellent work.

  • @drakouzdrowiciel9237
    @drakouzdrowiciel9237 11 days ago

    thx

  • @arianetrek7049
    @arianetrek7049 3 days ago

    The corrective RAG schema explains why AI often tries to bring results from the web even when you tell them not to in prompt. If it doesn't understand the source properly it will look elsewhere. This was insightful, thank you.

  • @rab0309
    @rab0309 18 days ago +11

    great video keep making these please.. only "criticism" / advice if you can call it that is to keep things focused on local / open source solutions as much as possible.. love the use of Ollama here for example.. things that perhaps don't require API keys, subscriptions, external integrations / dependencies help people like me understand more of what's going on in a workflow like this! thanks again!

  • @faktogeek
    @faktogeek 18 days ago +1

    here come dat boi!!!!!!

  • @mikahundin
    @mikahundin 17 days ago +1

    The speaker in the transcript discusses the use of AI, particularly large language models, in knowledge management. They highlight that AI can provide value in managing vast amounts of documentation and meeting notes, which can be overwhelming for humans to process. The speaker also mentions the potential disruption of traditional search engines like Google by large language models, which can provide hyper-personalized answers based on their extensive knowledge.
    The speaker then introduces the concept of a retrieval augmented generation (RAG) pipeline, which involves extracting information from real data sources, converting them into a vector database, and retrieving relevant information to answer user queries. However, they also note the challenges in building a production-ready RAG application, including dealing with messy real-world data, accurately retrieving relevant information, and handling complex queries that may involve multiple data sources.
    The speaker also discusses various tactics to mitigate these challenges, such as better data preprocessing, optimal chunk size, relevance-based retrieval, and hybrid search methods. They also mention the use of agentic RAG, which utilizes agents' dynamic and reasoning abilities to decide the optimal RAG pipeline and improve the answer quality.
    The speaker concludes by expressing their curiosity about how AI-native startups operate and embed AI into their business processes. They recommend a research document on the subject for those interested.
    In summary, the speaker's points are:
    1. AI, particularly large language models, can provide significant value in knowledge management.
    2. Traditional search engines could potentially be disrupted by large language models.
    3. Retrieval augmented generation (RAG) pipelines can be used to answer user queries based on private knowledge.
    4. Building a production-ready RAG application is complex due to challenges like messy real-world data, accurate retrieval of relevant information, and handling complex queries.
    5. Various tactics can mitigate these challenges, including better data preprocessing, optimal chunk size, relevance-based retrieval, and hybrid search methods.
    6. Agentic RAG can further improve answer quality by utilizing agents' dynamic and reasoning abilities.
    7. The speaker is interested in how AI-native startups operate and embed AI into their business processes, and recommends a research document on the subject.

    • @pithlyx9576
      @pithlyx9576 15 days ago

      Dead internet theory is getting closer and closer every day

  • @98hghghg98
    @98hghghg98 15 days ago

    great video jason! quick question, I'm wondering if a knowledge graph in place of a vector database would be better, since it mitigates the lost-in-the-middle problem?

  • @FernandoOtt
    @FernandoOtt 18 days ago

    Awesome content Jason. A Question. I need to create an AI psychologist and store college data, but this college data is a guide of what to speak, not the content itself.
    In that case, what is the best approach, RAG or Fine-tuning?

  • @jonm6834
    @jonm6834 11 days ago

    You got a sub. Finally, an AI channel that actually teaches.

  • @EverythinTechnology
    @EverythinTechnology 18 days ago

    I thought we were gonna fine tune llama3 😢 but the fire crawl implementation looks unreal I’ll have to check that out and add it to my rags.
    I don’t know how well it’ll work for RAGs but people have extended the context window like crazy and still can do the needle in haystack to around 130k.
    If you have 64gb on the Mac you can try out the 256k context window Llama 3 released by Eric Hartford. Would love to see a side by side with both of them using the same embeddings.

  • @MrLiteratur
    @MrLiteratur 14 days ago +2

    Thanks, Jason, incredible as always! Would you consider sharing the code from the walkthrough? 🙏

    • @AIJasonZ
      @AIJasonZ 12 days ago

      Thanks mate, appreciate it! Code is in the description link!

    • @yashsrivastava677
      @yashsrivastava677 6 days ago

      @@AIJasonZ Link is not there

  • @shimin3356
    @shimin3356 13 days ago

    Hey Jason, thanks for the video, I think it helps a lot. Can I apply this to GPT as well?

  • @freddy29228
    @freddy29228 11 days ago

    Thanks Jason, great video, this explains RAG pretty well. Subscribed!

  • @EveDe-ug3zv
    @EveDe-ug3zv 17 days ago +1

    Great video Jason, I only missed routing as a technique to determine if your question should really go through the RAG. James Briggs has done a few good videos on “semantic routing”.
    Is your example notebook available somewhere?

    • @christenjacquottet9799
      @christenjacquottet9799 2 days ago

      I'm wondering the same thing. Don't see a link to a github repo

  • @tonygil8617
    @tonygil8617 17 days ago +1

    Hi, brilliant session, do you have a link for the notebook?

  • @shephusted2714
    @shephusted2714 18 days ago +5

    too many api calls here - do it local with no api calls - better and the model has to be able to crawl more doc formats - people will probably do p2p, real time and uncensored models for 'real' open source ai that has no limiting factors like api calls or tokens - this is where things need to go in order to take off, gain relevance and leverage economies of scale, of course cxl and better i/o will help but those are on the way. real open source ai will hit smb mkt in about 4-5 years and there will be more innovation and discovery - exciting times as we all watch the development curve

  • @Joe-bp5mo
    @Joe-bp5mo 18 days ago

    This answers a lot of questions about why my chat with PDF doesn't work, llama parser & firecrawl look so freaking good!

  • @Truzian
    @Truzian 18 days ago +1

    would be great to get a video on best methods for data extraction from these pdfs

  • @nrusimha11
    @nrusimha11 9 days ago

    Thank you. Can you say a little about your hardware setup for this work? This information is missing from a lot of online sources.

  • @RenAok
    @RenAok 10 days ago

    Very useful, thank you! Is it possible for the model to retrieve images or graphs from a PDF, or is it only text?

  • @mikey1836
    @mikey1836 14 days ago +1

    Interesting. Someone needs to create a wrapper which works out the best way to answer questions / queries, based on the input and question/query. I think the intelligence of the system could then be increased.

  • @sharex21
    @sharex21 17 days ago +1

    I'm a simple man. I see a new AI Jason video, I click.

  • @biiiiiimm
    @biiiiiimm 17 days ago

    What about preparing data, for example as question / response, the response would be used to generate embedding and the response would be the data retrieved?

  • @sd5853
    @sd5853 15 days ago

    I don’t understand everything but I can feel the gold penetrating my ears

  • @KouadioJeanCyrilleNgoran
    @KouadioJeanCyrilleNgoran 3 days ago

    Thanks Jason, can I use Llama through an API and train it on PDF files in a specific directory so it can respond?

  • @junmagic8847
    @junmagic8847 17 days ago

    amazing as always. could you share the notebook please

  • @nuluai
    @nuluai 12 days ago

    We've been trying to build a middleware that connects with any inventory ERP so the chatbot can have real-time inventory data

  • @sayfeddinehammami6762
    @sayfeddinehammami6762 17 days ago

    Good rag video, the thumbnail talking about "training llama3" is hurting my brain tho

  • @morffisTFT
    @morffisTFT 18 days ago +40

    Can you share the code in the video?

    • @basedmuslimbooks
      @basedmuslimbooks 15 days ago +3

      I was hoping that was the case since it's a "simple" workflow

    • @pollywops9242
      @pollywops9242 14 days ago

      The code is personal you need to apply for a download link with meta and it will provide the code to copy / paste

    • @christenjacquottet9799
      @christenjacquottet9799 2 days ago

      @@pollywops9242 apply where? I don’t see it

  • @eventsjamaicamobileapp1426
    @eventsjamaicamobileapp1426 15 days ago

    Great video. How do I add PDF documents and llama_parse to the python notebook?

  • @JaimeGuajardo
    @JaimeGuajardo 14 days ago

    👍👍

  • @gdr189
    @gdr189 12 days ago

    Hi, what are the areas current LLMs excel at?
    I am new to this world of AI, but not IT (familiar with infra). It is good that people are trying out things to see what it can do. But my naïve thoughts are that as a language tool, it just looks for patterns of words that appear close together, and knows enough of the formation of language that it produces text that is not only readable, but also relevant. But this surely must have limits, if it does not actually understand?
    Would it be serving up answers from well-vetted, well-written sources such as an internal KMS by using this RAG method? Our team was thinking about its use for education / learning - perhaps tied into custom flashcards and evaluation of human-provided answers, alongside the still very useful text summarisation and alternative wording suggestions.

  • @sinasec
    @sinasec 10 days ago

    Great, thanks. Can we get the repo and a link to the Colab notebook?

  • @AbdulMajeed-lf5sq
    @AbdulMajeed-lf5sq 10 days ago

    I watch lots of AI videos and 99% of them are just a waste of time. As an AI engineer, this channel is hands down the BEST yet
    KEEP UP👏🏼

  • @VipulChaudhary1337
    @VipulChaudhary1337 18 days ago +1

    Goddamn it Jian Yang

  • @CecilMerrell
    @CecilMerrell 12 days ago

    I like using gemini for getting quick up to date answers, and chat gpt for stuff that doesn't require up to date stuff

  • @yunxinglu4020
    @yunxinglu4020 14 days ago

    Does this code require a good GPU as a must? I am using my 32 GB CPU and it is super super slow to generate the answer. If the GPU is a must, any recommendation for a GPU model? I am seeing Jason in the video generate the answer in seconds and I know he is using a Mac. Thanks in advance!

  • @mahmood392
    @mahmood392 12 days ago

    Would you have plans to create a tutorial that connects what you're teaching here with running things on something like AnythingLLM, which allows document reading to create embeddings?

  • @user-lw3fs3tl9x
    @user-lw3fs3tl9x 18 days ago +1

    Are those steps and advice explained on your website? It would be amazing if you could share the code 😮

  • @KemalCanKara
    @KemalCanKara 12 days ago

    You said that you can fine-tune a model to teach it new knowledge. But is that really correct? Decoder-based models are fine-tuned for alignment.

  • @gsprlls
    @gsprlls 18 days ago

    Curious how this workflow changes with bigger context length. Gradient just released Llama-3 8B with a 1M context length

  • @dmy_tro
    @dmy_tro 16 days ago

    Can we also finetune the 70B model? Even if it's not local

  • @henry_room
    @henry_room 17 days ago

    Would there be a way to automate this with Obsidian? I sporadically log everything in Obsidian and it would be amazing to find a way to do this with Obsidian

  • @MarcusLaw-ss5lt
    @MarcusLaw-ss5lt 13 days ago

    is there a good parser for powerpoint?

  • @simon_colby
    @simon_colby 8 days ago

    Hey, Jason. This video is 🔥🔥! Congrats, I was wondering if there is a chance to reach out to you? I might have an interesting offer for you.

  • @ConsultingjoeOnline
    @ConsultingjoeOnline 16 days ago

    Great video. Thanks! A lot of very good tips!

  • @Dom-zy1qy
    @Dom-zy1qy 15 days ago

    4:36 Someone walks into the void and disappears

  • @florentflote
    @florentflote 18 days ago

  • @mayanknagwanshi
    @mayanknagwanshi 12 days ago

    Damn it Jin yiang 😂

  • @FunkyByteAcademy
    @FunkyByteAcademy 11 days ago

    Fucking dope bra

  • @teapotexorcist
    @teapotexorcist 15 days ago

    There is a problem with the "Corrective RAG agent" URL in the description.

  • @NileshKumar_NK
    @NileshKumar_NK 13 days ago

    Hi Jason, amazing stuff, can you please share the code?

  • @user-nc8kp5kg5c
    @user-nc8kp5kg5c 15 days ago

    Can you create an end-to-end custom fine-tuned Llama LLM with an API?

  • @titusblair
    @titusblair 18 days ago

    The Corrective RAG agent link is not working for me. Also, do you have a GitHub project for this tutorial? Thanks!

  • @faizanjaved1443
    @faizanjaved1443 18 days ago

    GPT-5 has been released. Shall we explore it together?