Deep dive into RAG Chunking Strategies: CharacterText Splitter to Semantic Chunking

  • Published on 5 Sep 2024
  • - Why we need chunking in RAG
    - Different chunking strategies and their pros and cons
    - CharacterText Splitter
    - RecursiveCharacter Text Splitter
    - TokenText Splitter
    - Chunk size based on embedding max tokens
    - LLM context length
    - Semantic chunking
    Notebook: github.com/ari...
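
To make the first two strategies in the list concrete, here is a minimal pure-Python sketch. It is not the code from the linked notebook and does not use any splitter library; the function names `split_by_characters` and `split_recursively`, their parameters, and the default separator order are assumptions chosen for illustration. The first function cuts fixed-size character windows with an optional overlap; the second tries coarse separators (paragraphs) before finer ones (lines, words) and only hard-cuts by characters as a last resort.

```python
from typing import List, Sequence


def split_by_characters(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Fixed-size character chunks; neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def split_recursively(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = ("\n\n", "\n", " "),  # assumed order: coarse to fine
) -> List[str]:
    """Greedily pack pieces split on the coarsest separator that works."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # Nothing left to split on: fall back to a hard character cut.
        return split_by_characters(text, chunk_size)
    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # piece still fits into the open chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece alone is too large: retry it with finer separators.
            chunks.extend(split_recursively(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


print(split_by_characters("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
print(split_recursively("aaa bbb\n\nccc ddd eee", chunk_size=7))
# ['aaa bbb', 'ccc ddd', 'eee']
```

The recursive variant keeps paragraph and word boundaries intact where it can, which is the usual argument for preferring it over a plain character window. Token-based and semantic chunking follow the same shape but measure size in tokens or cut where embedding similarity between adjacent sentences drops; both need a tokenizer or an embedding model and are left out of this sketch.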

Comments • 7

  • @awakenwithoutcoffee
    @awakenwithoutcoffee 3 months ago +1

    I appreciate this excellent breakdown, Aritra. The semantic chunker could be a real breakthrough for technical documentation. One of the problems we keep facing is that "summarization" often leaves out too many details (e.g. technical lists that come out incomplete, or instructions that should NOT be summarized but still are).😶

    • @AritraSen
      @AritraSen 3 months ago +1

      Hey, glad you liked it :)
      One thing you can try is giving instructions in the prompt and asking the model to think step by step while following those instructions.

  • @oguzhanylmaz4586
    @oguzhanylmaz4586 3 months ago

    Hello, I want to build a project like this: a FastAPI or Flask API + React.js + an open-source LLM (Llama 2, Llama 3, Mistral, etc.). But I couldn't find any project to use as a reference. Could you make one?
    I don't want to use OpenAI. The project must be able to run without an internet connection.

    • @AritraSen
      @AritraSen 3 months ago +1

      We already have a playlist - RAG LLM App with FastAPI and Gradio: th-cam.com/play/PLOrU905yPYXIqQLY6ulQqB8e414-DFuyd.html

    • @awakenwithoutcoffee
      @awakenwithoutcoffee 3 months ago

      @AritraSen I think OP is looking to integrate a React front end through an API. Gradio seems good for a demo, but React is a lot more customizable.

    • @AritraSen
      @AritraSen 3 months ago +1

      @awakenwithoutcoffee Hey, sorry, React is not in my skill set...

    • @awakenwithoutcoffee
      @awakenwithoutcoffee 3 months ago

      @AritraSen Totally fine, brother, keep on creating!