Semantic Chunking for RAG with

  • Published on Feb 2, 2025

Comments • 15

  • @AI-Makerspace  10 months ago +1

    Google Colab notebook: colab.research.google.com/drive/1gGLd-rdPsM1iy4JmL1V1mfZm90CmDcXR?usp=sharing
    Event Slides: www.canva.com/design/DAGAtxFPH2M/3oo8gElRKU21fQH-ZzYNNA/view?DAGAtxFPH2M&

  • @damiangilgonzalez8011 10 months ago +1

    Awesome job guys! I watched this video with my coffee this morning and it was a perfect way to start my day (learning, drinking a coffee, and listening to really good speakers/teachers)

    • @AI-Makerspace  10 months ago

      This is awesome Damian - thank you! We're pumped we got to spend the morning with you :)

  • @bananamaker4877 10 months ago

    Love this video and the new strategy of semantic chunking. Thanks to Greg and Chris for explaining this concept the way it should be explained. Thanks again for making it open source.

    • @AI-Makerspace  10 months ago +1

      Thanks bananamaker!! We enjoyed getting down into the weeds of some often-overlooked pieces today, and we're also fans of the new strategy! Look for more content like this from us soon!

  • @JankayYashwant 8 months ago

    Please make many more awesome explainers like this!

    • @AI-Makerspace  8 months ago +1

      You can count on it @JankayYashwant!

  • @channel_panel193 10 months ago +1

    heyyy u guys look familiar from the fourthbrain bootcamp i took! nice

  • @DataScienceandAI-doanngoccuong 4 months ago

    On the scale used to rate chunking techniques, semantic chunking and agentic chunking are rated at levels 4 and 5. Experiments show that agentic chunking using LLMs gives the best results.
    Level 1: Character splitting - simple, static character-based chunks of data
    Level 2: Recursive character text splitting - recursive chunking based on a list of separators
    Level 3: Document-specific splitting - different chunking methods for different document types (PDF, Python, Markdown)
    Level 4: Semantic splitting - embedding-based chunking. This technique splits text into smaller chunks based on meaning rather than on fixed length alone.
    Level 5: Agentic splitting - Agentic Chunker: the Agentic Chunker automatically groups related propositions into chunks. When a new proposition is added, the system decides whether to add it to an existing chunk or to create a new chunk.
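
As a rough illustration of Level 4 (semantic splitting) from the comment above, here is a minimal sketch, not the notebook's or any library's implementation. It assumes a toy bag-of-words `embed` function in place of a real embedding model, and the names `embed`, `cosine_distance`, `semantic_chunks`, and the `threshold` value are all hypothetical. The idea it shows: embed consecutive sentences and start a new chunk wherever the distance between neighbors is large.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy bag-of-words vector. Assumption: a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def semantic_chunks(text, threshold=0.9):
    """Split text into sentences, then break wherever adjacent sentences are dissimilar.

    `threshold` is an illustrative value chosen for the toy embedding, not a recommendation.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    vectors = [embed(s) for s in sentences]
    chunks = [[sentences[0]]]
    for prev_vec, cur_vec, sentence in zip(vectors, vectors[1:], sentences[1:]):
        if cosine_distance(prev_vec, cur_vec) > threshold:
            chunks.append([sentence])      # large semantic jump: start a new chunk
        else:
            chunks[-1].append(sentence)    # similar enough: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    sample = (
        "Cats purr when happy. Cats sleep most of the day. "
        "Stock markets fell sharply. Stock markets recovered later."
    )
    for chunk in semantic_chunks(sample):
        print(chunk)
```

With a real embedding model the breakpoint is often chosen relative to the distribution of distances (for example a percentile) rather than as a fixed constant; the fixed threshold here only keeps the sketch short.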

  • @NhatNguyen-bq6jj 8 months ago

    Can you introduce some related articles? Thanks!

    • @AI-Makerspace  5 months ago

      medium.com/the-ai-forum/semantic-chunking-for-rag-f4733025d5f5

  • @zugbob 9 months ago

    When doing RAG in general, is it best to insert the retrieved context into the system prompt or to put it in an assistant message?

    • @AI-Makerspace  9 months ago

      It's really up to you - and depends on if you're using examples or not.

  • @MrDespik 10 months ago

    You forgot to show how we can combine semantic chunking with a parent document retriever :)
    I mean, which chunks should we use as parents and which as children?

    • @AI-Makerspace  10 months ago

      I'm sorry! We didn't intend to explore this in the session!