Semantic Chunking for RAG with

แชร์
ฝัง
  • เผยแพร่เมื่อ 24 ธ.ค. 2024

ความคิดเห็น • 15

  • @AI-Makerspace
    @AI-Makerspace  9 หลายเดือนก่อน +1

    Google Colab notebook: colab.research.google.com/drive/1gGLd-rdPsM1iy4JmL1V1mfZm90CmDcXR?usp=sharing
    Event Slides: www.canva.com/design/DAGAtxFPH2M/3oo8gElRKU21fQH-ZzYNNA/view?DAGAtxFPH2M&

  • @damiangilgonzalez8011
    @damiangilgonzalez8011 9 หลายเดือนก่อน +1

    Awesome job guys! I wached this video with my coffe this morning and it was a perfect way to start my day (learning, drinking a coffe and lisening a really good spekears/teachers)

    • @AI-Makerspace
      @AI-Makerspace  9 หลายเดือนก่อน

      This is awesome Damian - thank you! We're pumped we got to spend the morning with you :)

  • @bananamaker4877
    @bananamaker4877 9 หลายเดือนก่อน

    Love this video and new strategy of semantic chunking. Thanks to Greg and Chris for explaining this concept the way how it should be. Again thanks for making it open source.

    • @AI-Makerspace
      @AI-Makerspace  9 หลายเดือนก่อน +1

      Thanks bananamaker!! We enjoyed getting down into the weeds of some often-overlooked pieces today, and we're also fans of the new strategy! Look for more content like this from us soon!

  • @DataScienceandAI-doanngoccuong
    @DataScienceandAI-doanngoccuong 2 หลายเดือนก่อน

    Trong thang đánh giá kỹ thuật Chunking thì Chunking theo ngữ nghĩa và chunking theo agent được đánh giá ở cấp 4 và 5. Thực nghiệm cho thấy chunking agentic sử dụng LLMs cho kết quả cao nhất.
    Cấp 1: Tách ký tự - Các đoạn dữ liệu ký tự tĩnh đơn giản
    Cấp 2: Tách văn bản ký tự đệ quy - Chia nhỏ đệ quy dựa trên danh sách các dấu phân cách
    Cấp 3: Tách theo từng loại tài liệu - Các phương pháp chia nhỏ khác nhau cho các loại tài liệu khác nhau (PDF, Python, Markdown)
    Cấp 4: Tách ngữ nghĩa - Chia nhỏ dựa trên embedding. Kỹ thuật này chia đoạn văn bản thành các đoạn nhỏ dựa trên ngữ nghĩa, thay vì chỉ dựa vào độ dài cố định.
    Cấp 5: Tách dùng agent - Agentic Chunker: Agentic Chunker tự động nhóm các propositions (mệnh đề) có liên quan vào các chunks (nhóm). Khi thêm một proposition mới, hệ thống sẽ xác định xem có nên thêm nó vào một chunk hiện có hay tạo một chunk mới.

  • @channel_panel193
    @channel_panel193 8 หลายเดือนก่อน +1

    heyyy u guys look familiar from the fourthbrain bootcamp i took! nice

  • @JankayYashwant
    @JankayYashwant 7 หลายเดือนก่อน

    Please make many more awesome explainers like this!

    • @AI-Makerspace
      @AI-Makerspace  7 หลายเดือนก่อน +1

      You can count on it @JankayYashwant!

  • @NhatNguyen-bq6jj
    @NhatNguyen-bq6jj 7 หลายเดือนก่อน

    Can you introduce some related articles? Thanks!

    • @AI-Makerspace
      @AI-Makerspace  4 หลายเดือนก่อน

      medium.com/the-ai-forum/semantic-chunking-for-rag-f4733025d5f5

  • @zugbob
    @zugbob 8 หลายเดือนก่อน

    When doing RAG in general is it best to insert it into the system prompt or to have an assistant message for it?

    • @AI-Makerspace
      @AI-Makerspace  8 หลายเดือนก่อน

      It's really up to you - and depends on if you're using examples or not.

  • @MrDespik
    @MrDespik 9 หลายเดือนก่อน

    You forgot to show how we can combine semantic chunking with parent document retriever)
    I mean what chunks we need to use as parents and as childs.

    • @AI-Makerspace
      @AI-Makerspace  8 หลายเดือนก่อน

      I'm sorry! We didn't intend to explore this in the session!