How language model post-training is done today

  • Published on Jan 9, 2025

Comments • 5

  • @broyojo
    @broyojo 2 days ago +2

    such a valuable video, thank you for this great content

  • @Pingu_astrocat21
    @Pingu_astrocat21 a day ago

    Thank you for sharing! So much to learn.

  • @420_gunna
    @420_gunna 6 days ago

    One thing I'd be curious to learn more about from you re: post-training is your thoughts on tool use. I basically don't know much beyond the Gorilla/Toolformer papers. I think the topic sort of dovetails with reasoning/verification when you're talking about tools like calculators and code executors, but there are also search/retrieval-type tools. I remember that some months ago you weren't as into RAG as I would have expected; I wonder if that's changed at all.
    I too have always loved CAI and am excited to play (maybe with little SmolLM models) in these "not-traditionally-verifiable domains" using RL -- character-related things!

    • @interconnects
      @interconnects 5 days ago

      We're working on tools in 2025!