LLM Self-Taught Reasoning - Explained!

  • Published on 29 Jan 2025

Comments • 10

  • @MohitKumar-iv5ri 15 days ago +2

    Hello brother, I love your videos ❤ Can you make a complete "evolution of AI" video where you talk about and explain the idea behind AI and how it evolved?

    • @CodeEmporium 15 days ago

      Thank you! This is currently the playlist I am making. You can check out the playlist “Evolution of neural networks”.

  • @eugeniivakulenko5281 6 days ago

    Your timer-clock simulation is amazing 🤩

  • @EricKaysan 2 months ago

    Great job!

  • @EobardUchihaThawne 2 months ago +1

    Why do you think task-specific models (basic, non-LLM models) can't do arithmetic without a CoT dataset?

  • @raihanpahlevi6870 a month ago

    How do you tell the LLM to generate a rationale in the STaR algorithm?

  • @raihanpahlevi6870 a month ago

    You mentioned that we need 10 data examples with rationales. Where do we use these 10 examples in the diagram of the STaR algorithm process?

  • @willw4957 2 months ago +1

    How do the back-propagation, fine-tuning and inference work though? The rationale is a more detailed answer, so is this bootstrapping the dataset with model outputs, hoping there's enough context in the question and answer to generate a rationale? That is probably why the rationales are still wrong.

    • @CodeEmporium 2 months ago +1

      For backprop, the output answer (without the rationale) generated during the rationale generation phase is compared to the label output. From this, we get a loss, and backprop then lets the network learn during the fine-tuning phase.
      The issue here, and with STaR, is that even though the answer may be right, the rationale could be wrong.
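
The comparison step discussed in this thread can be illustrated with a small sketch. It assumes, as in the published STaR description as far as I understand it, that the generated answer is checked against the label to decide which (question, rationale, answer) triples go into the fine-tuning set. Everything named here (`star_filter`, `toy_generate`, the canned arithmetic data) is hypothetical and stands in for a real few-shot prompted LLM call; no actual fine-tuning is performed.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]      # (question, gold answer)
Generated = Tuple[str, str]    # (rationale, predicted answer)


def star_filter(
    dataset: List[Example],
    generate: Callable[[str], Generated],
) -> List[Tuple[str, str, str]]:
    """Keep (question, rationale, answer) triples whose predicted answer matches the label."""
    kept = []
    for question, gold in dataset:
        rationale, predicted = generate(question)
        # Only the final answer is checked; the rationale itself is never verified.
        if predicted.strip() == gold.strip():
            kept.append((question, rationale, gold))
    return kept


def toy_generate(question: str) -> Generated:
    """Hypothetical stand-in for a few-shot prompted LLM call (assumption, not a real API)."""
    canned = {
        "2 + 3 = ?": ("2 plus 3 makes 5.", "5"),
        "4 * 4 = ?": ("4 plus 4 is 8.", "8"),      # wrong answer: dropped by the filter
        "10 - 7 = ?": ("7 is odd, so 3.", "3"),    # right answer, flawed rationale: kept
    }
    return canned[question]


dataset = [("2 + 3 = ?", "5"), ("4 * 4 = ?", "16"), ("10 - 7 = ?", "3")]
finetune_set = star_filter(dataset, toy_generate)
for question, rationale, answer in finetune_set:
    print(question, "|", rationale, "|", answer)
```

The third toy example shows the weakness raised above: because only the final answer is compared with the label, a flawed rationale that happens to reach the right answer still ends up in the data used for fine-tuning.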

  • @CyberwizardProductions 2 months ago

    It was a good video - until you got to quiz time and decided to try to click your tongue and make incredibly annoying Jeopardy sounds. That cost you a like and a subscribe.