Build Your Own Local PDF RAG Chatbot (Tutorial)

  • Published on 26 Jan 2025

Comments • 70

  • @uchihaerenyeager1756 2 months ago +4

    Your content is amazing! ⭐️⭐️⭐️⭐️⭐️ Thank you for all the effort you put into it. I’m so grateful I found your channel. You’ve earned my sub, and I can’t wait to see more from you!

    • @tonykipkemboi 2 months ago +1

      Thank you so much for your kind words! I’m glad you find it helpful.

  • @watchthemanual2774 2 months ago +3

    Saw this posted on Reddit today, hopped on my laptop right away. Very detailed, yet simply explained. Just picked up a new subscriber, thanks.

    • @tonykipkemboi 2 months ago

      @watchthemanual2774 thank you! Glad you found it helpful.

  • @ivoryontrack 2 months ago +3

    thank you so much! had an assignment to learn how to create a rag chatbot w multiple pdfs as the data source and i came across your channel while researching. the previous tutorials you made were already helpful but i saw you were going to make an updated video and i was super excited. this was great, subscribed to you for more content in the future too. 🚀

    • @tonykipkemboi 2 months ago

      @ivoryontrack awesome! glad you found it helpful.

  • @free_thinker4958 2 months ago +2

    Great tutorial ❤, we're looking forward to your tutorials on LangGraph for agentic workflows with Chainlit as the frontend 🎉🎉

    • @tonykipkemboi 2 months ago +1

      Great suggestion!

  • @alexsandrotabosa4461 2 months ago +1

    Thank you so much for your content. You are helping me a lot. Hugs from Brazil!

    • @tonykipkemboi 2 months ago

      @alexsandrotabosa4461 I'm happy that you found it helpful! 😌

  • @dawmro a month ago +2

    Yeah, newer package versions are always a source of problems for me, especially when I forget to run pip freeze > requirements.txt to save the specific versions.

  • @MarahTal 21 days ago +1

    Thank you again for this updated tutorial! It is really helpful. I have a question:
    which Python version did you use for this updated code?

    • @tonykipkemboi 21 days ago

      @MarahTal thank you! Python 3.12.7

  • @MinoasPediadas 2 months ago +1

    Thank you very much for your content and efforts. Assume the following scenario: let's say a document describes some criteria in specific paragraphs and a second document describes a project proposal. I want to check how well the project proposal addresses the criteria as set in the first document. Would something like that be a feasible use-case and what would it take to implement it?

  • @tradertube a month ago +1

    Awesome video. Thanks for sharing.

  • @techchitti2.o118 a month ago +1

    Very informative 👏

    • @tonykipkemboi a month ago

      Glad you found it helpful!

  • @datacentricsystems9584 27 days ago +1

    Beyond Awesome!

    • @tonykipkemboi 27 days ago

      I'm glad you found it useful.

  • @ШохрухАбдивоитов 2 months ago +1

    Hi Tony, I found your tutorial on how to chat with PDF files, which reduces the time needed to process information. I have one question. I went through the whole tutorial step by step and cloned your repository. When I run the streamlit_app.py file to deploy locally, the app cannot see the Ollama models, even though I have downloaded them to my computer. Can you explain what is going on? Thank you in advance for your response.

  • @thenoobdev7365 6 days ago

    Thank you!

  • @surbhi.emergingtech a month ago +2

    When working with Streamlit, it shows a module_info name error, and in the notebook cells it shows a "DLL load failed" error. How do I fix it?

    • @tonykipkemboi a month ago

      @surbhi.emergingtech can you share the full error here?

  • @quiksilver10152 2 months ago +1

    Building this with a virtual environment and installing the latest C++ distribution solved all problems. What a useful app, thank you! The pdf preview window on the left seems to be low resolution on the browsers I have tried this on. Any way to increase the pixel density?

    • @quiksilver10152 2 months ago +1

      Added resolution=1440 to the two calls to page.to_image() in streamlit_app to solve the issue.

    • @tonykipkemboi 2 months ago

      @quiksilver10152 awesome work! Thanks for sharing the solutions. Yes, you got the resolution input right.
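
The fix described in this thread can be sketched as follows. This assumes the preview is rendered with pdfplumber, whose `Page.to_image()` accepts a `resolution` keyword in dots per inch (72 by default). `render_page` and `render_all` are hypothetical helper names for illustration, not functions from the tutorial repository.

```python
def render_page(page, resolution=144):
    """Render one PDF page to an image at the given resolution (DPI).

    `page` is expected to behave like a pdfplumber Page, that is,
    to expose to_image(resolution=...).
    """
    return page.to_image(resolution=resolution)


def render_all(pdf, resolution=144):
    """Render every page of an opened PDF at the same resolution."""
    return [render_page(page, resolution) for page in pdf.pages]


# Typical use (requires `pip install pdfplumber`; the file name is an example):
#
#   import pdfplumber
#   with pdfplumber.open("example.pdf") as pdf:
#       images = render_all(pdf, resolution=144)
```

Higher values give a sharper preview but use more memory and render more slowly, so it may be worth trying a moderate value before going as high as the 1440 mentioned above.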

  • @JumpingStar-t9v 2 months ago +2

    Getting the error below while running it:
    "DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed."
    Please suggest a fix.

    • @tonykipkemboi 2 months ago

      @JumpingStar-t9v
      Please try these solutions:
      1. Install Microsoft Visual C++ Redistributable:
      - Download and install both x64 and x86 versions from Microsoft's official website: learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170#latest-microsoft-visual-c-redistributable-version
      - Restart your computer after installation
      2. If the error persists, try installing ONNX Runtime manually:
      ```bash
      pip uninstall onnxruntime onnxruntime-gpu
      pip install onnxruntime
      ```
      If you're running on a CPU-only system:
      1. Ensure you have the CPU version of ONNX Runtime:
      ```bash
      pip uninstall onnxruntime-gpu # Remove GPU version if installed
      pip install onnxruntime # Install CPU-only version
      ```
      2. You may need to modify the chunk size in the code to prevent memory issues:
      - Reduce `chunk_size` to 500-1000 if you experience memory problems
      - Increase `chunk_overlap` for better context preservation
      Note: The application will run slower on CPU-only systems, but it will still work effectively.
      I will add these tips to the README as well.
      Please let me know if it resolves the issue.
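
To make the `chunk_size` and `chunk_overlap` advice above concrete, here is a minimal sliding-window sketch in plain Python. It is not the text splitter the tutorial uses; `split_text` is a hypothetical helper that only illustrates how the two parameters interact.

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so content that
    straddles a boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

Smaller chunks lower the memory needed per embedding call, while a larger overlap means more chunks in total, so the two settings pull in opposite directions on a CPU-only machine.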

    • @danielgitau2054 2 months ago

      Check if this works for you
      onnx==1.16.1

    • @sifu1077 a month ago

      @danielgitau2054 Hi, I downgraded from onnx 1.17.0 to onnx 1.16.1. The errors are gone, but the Python script exits without any message.

  • @muhammadsawaiz1064 a month ago

    I am not able to get rid of this error:
    "Error: failed to find libmagic. Check your installation"
    Does anyone have any idea about it? I am using Ollama on a CPU-based laptop.

  • @ruchirahasaranga8076 2 months ago +1

    Can I ask a question using a page number?

  • @karthikbsk144 2 months ago

    Thank you. Is 16 GB of RAM enough to run Llama locally, for example on an M4 Mac mini?

  • @khurramumair8162 a month ago +1

    The error I am encountering is specifically related to the input of the chain.invoke() method in LangChain.
    It expects JSON/dict input but seems to receive an empty string (''), and this mismatch triggers a ValidationError in Pydantic.

    • @tonykipkemboi a month ago

      @khurramumair8162 add your question inside the ''
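
One way to avoid passing an empty string by accident is to validate the question before calling the chain. The sketch below assumes the chain accepts a plain question string, as the reply above implies; `ask` is a hypothetical wrapper and is not part of LangChain.

```python
def ask(chain, question):
    """Call chain.invoke() only when the question is a non-empty string."""
    cleaned = (question or "").strip()
    if not cleaned:
        # Fail early with a clear message instead of a validation error
        # raised deep inside the chain.
        raise ValueError("question must be a non-empty string")
    return chain.invoke(cleaned)


# Typical use, where `chain` is the chain object built in the tutorial:
#
#   answer = ask(chain, "What is the main idea of this document?")
```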

  • @AkshayKumar-qc4rz 2 months ago +1

    Windows-based install.
    Getting the error below while running it:
    "DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed."
    I'm also having this same issue. I tried the steps given on your GitHub to fix it, but they didn't work.
    I even tried rolling back onnx to both 1.16.1 and 1.16.0 (1.15.0 doesn't install).
    I need help; this project seems very interesting and I want to implement it.

    • @tonykipkemboi 2 months ago

      @AkshayKumar-qc4rz I will test it out on my Windows system later today and report back.

  • @zandanshah 28 days ago +1

    Which version of Python are you using?

  • @samthrimavithana8243 2 months ago +2

    Hey, can you do a step-by-step tutorial on how to clone and run the application?

    • @quiksilver10152 2 months ago

      Finding the correct combination of libraries is a nightmare.

  • @doughimes a month ago

    Getting this error: "streamlit.runtime.caching.cache_errors.UnhashableParamError: Cannot hash argument 'models_info' (of type ollama._types.ListResponse) in 'extract_model_names'." I am running in a Windows WSL environment. Everything installed, but when I open the web interface it gives me the models error.
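
Streamlit's caching decorators hash every argument of the decorated function, and they skip any parameter whose name starts with an underscore. A possible workaround for the error above is therefore to rename the parameter. The sketch below is an assumption about how such a function could look, not the repository's actual code. It also assumes the Ollama client returns either an object with a `.models` list (whose items carry a `.model` attribute) or an older dict-style response.

```python
def extract_model_names(_models_info):
    """Return the available model names as a tuple of strings.

    In the Streamlit app this function would carry a caching decorator such
    as @st.cache_resource; the leading underscore in `_models_info` tells
    Streamlit not to hash that argument.
    """
    # Newer clients: response object with a .models attribute.
    models = getattr(_models_info, "models", None)
    # Older clients: plain dict with a "models" key.
    if models is None and isinstance(_models_info, dict):
        models = _models_info.get("models", [])

    names = []
    for entry in models or []:
        if isinstance(entry, dict):
            name = entry.get("model") or entry.get("name")
        else:
            name = getattr(entry, "model", None) or getattr(entry, "name", None)
        if name:
            names.append(name)
    return tuple(names)
```

Returning a tuple of plain strings keeps the cached value simple and independent of the client library's response type.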

  • @yashhurkadli1341 8 days ago

    idk why but i got soo many errors on data = loader.load(). can you please help me..?

  • @karansingh-ce8yy 2 months ago

    I am getting the DLL error in onnx. I reinstalled it and installed the x86 and x64 C++ redistributables, but so far nothing has helped. I am running Windows 11.

  • @tonywhite4476 a month ago +1

    FYI, the requirements.txt file has gradio instead of streamlit

    • @tonykipkemboi a month ago

      @tonywhite4476 thanks for pointing that out, just added it.

  • @ammaransari4329 2 months ago

    Spent hours resolving multiple dependency errors, installing multiple pieces of software, and following the exact steps from your GitHub.
    Still can't get it working. Using Windows with GPU. Tried re-installing onnxruntime-gpu.
    Visual C++ Redistributable is already installed.
    Please help.
    Error:
    ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.

  • @WHLau-g2z a month ago

    I get an error in the PDF loading section: ModuleNotFoundError: No module named 'pi_heif'. How do I solve it?

  • @armandf.s4036 2 months ago +1

    Has anyone here encountered an error when chatting with PDFs? I get a "'NoneType' object is not iterable" error, even though I've already listed all of my Ollama models and installed Ollama in my project.

    • @tonykipkemboi 2 months ago

      @armandf.s4036 can you share the full error message?

    • @armandf.s4036 2 months ago

      I fixed the problem btw, I just needed to upgrade a few dependencies. I have another problem though, @The How-To Guy: how do I fix this error when importing PDFs into my Streamlit UI?
      ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.

  • @AayushSahu-m8g 2 months ago +1

    How do we check the accuracy of this model?

    • @tonykipkemboi 2 months ago

      You mean retrieval accuracy? You can use something like RAGAS for evals and find truthiness. It's a bit involved.
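
As a lighter-weight starting point than a full evaluation framework, retrieval quality can be estimated with a hit rate over a small hand-labelled set of questions. The sketch below is a simple stand-in and not RAGAS; `hit_rate_at_k` and the chunk identifiers are hypothetical.

```python
def hit_rate_at_k(retrieved, relevant, k=3):
    """Fraction of questions with at least one relevant chunk in the top k.

    retrieved: dict mapping question -> ranked list of retrieved chunk ids
    relevant:  dict mapping question -> set of chunk ids judged relevant
    """
    if not retrieved:
        return 0.0
    hits = 0
    for question, ranked_ids in retrieved.items():
        expected = relevant.get(question, set())
        # A hit means the top-k results and the labelled set overlap.
        if set(ranked_ids[:k]) & set(expected):
            hits += 1
    return hits / len(retrieved)


if __name__ == "__main__":
    retrieved = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
    relevant = {"q1": {"b"}, "q2": {"w"}}
    print(hit_rate_at_k(retrieved, relevant, k=2))
```

This only measures whether the retriever surfaces the right chunks; it says nothing about whether the generated answer is faithful to them.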

  • @ItalianTiger955i 11 days ago +1

    I think the parts that are still missing are:
    1) prerequisites
    2) what to install to begin: which apps, programs and packages we need
    Because of this, the tutorial is impossible for beginners to understand from the first minute.

    • @tonykipkemboi 11 days ago

      @ItalianTiger955i thanks for your feedback. the target audience is intermediate to experienced for most tutorials on my channel. it's a balancing act that is hard to get right. most folks don't want the simple steps mentioned because they assume they already know them. good thing there are models like chatgpt for reference

  • @ruchirahasaranga8076 2 months ago +2

    Do not try this on windows. I wasted 4 hours on installation errors!

    • @tonykipkemboi 2 months ago

      @ruchirahasaranga8076 what errors are you getting exactly? It would be really helpful so I can help troubleshoot and help others who might run into the same issue.

    • @danielgitau2054 2 months ago +1

      hectic! Then i had to downgrade onnx to 1.16.1. It works but very slow for common queries.
      Otherwise great stuff Tony👍

    • @tonykipkemboi 2 months ago +1

      @danielgitau2054 thank you for helping troubleshoot. I'm reviving my old windows laptop for testing as well.

    • @vanidixit5506 20 days ago

      @tonykipkemboi How can I increase the speed on Windows? It is very slow. Please suggest.

  • @johncult6948 2 months ago +1

    Getting this module error even after listing the package in the requirements.txt file.
    from langchain_ollama import OllamaEmbeddings
    ModuleNotFoundError: No module named 'langchain_ollama'
    Wasted 2 hours trying to figure it out, but no luck.

    • @tonykipkemboi 2 months ago

      Did you use a virtual environment? Also try installing it directly in the notebook like this:
      !pip install langchain-ollama
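
A common cause of this kind of error is that the package was installed into a different Python environment than the one the notebook kernel runs. The sketch below shows one way to check this from inside the notebook using only the standard library; `describe_module` is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_module(module_name):
    """Report which interpreter is running and whether it can see a module."""
    spec = importlib.util.find_spec(module_name)  # None when not installed
    return {
        "interpreter": sys.executable,
        "module": module_name,
        "installed": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_module("langchain_ollama"))
```

If `installed` is false, installing with the reported interpreter (for example by running that interpreter with `-m pip install langchain-ollama`, or by using `%pip install` inside the notebook) targets the environment the kernel actually uses.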

    • @johncult6948 2 months ago +1

      @tonykipkemboi Did everything but no luck. Still getting the same error.

    • @tonykipkemboi 2 months ago

      @johncult6948 what system are you using, Mac or Windows?

    • @johncult6948 2 months ago

      @tonykipkemboi Mac

  • @vinaykumarkongaleti7848 2 months ago

    No matter what I do, I cannot get load() to work. I have installed and uninstalled different versions, to no avail:
    loader = UnstructuredPDFLoader(file_path=local_path)
    data = loader.load()
    ImportError: cannot import name 'open_filename' from 'pdfminer.utils'
    or
    ModuleNotFoundError: No module named 'pdfminer.utils'

  • @danielmprazeres 2 months ago

    Thanks for the tutorial. I'm struggling with pip install chromadb==0.4.22 on my Mac. According to GPT: "The error occurs because the clang++ compiler cannot find the iostream file, which is part of the C++ standard library (libc++). This typically happens when the C++ development environment on macOS is not correctly configured."
    But Xcode is already installed.