How do the back-propagation, fine-tuning, and inference work, though? The rationale is just a more detailed answer, so this is bootstrapping the dataset with model outputs, hoping there's enough context in the question-answer pair to generate a rationale? Which is probably why the rationales are still wrong.
For backprop, the output answer (without the rationale) generated during the rationale-generation phase is compared to the label output. From this we get a loss, and backprop lets the network learn during the fine-tuning phase. The issue here, and with STaR generally, is that even though the answer may be right, the rationale could still be wrong.
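The loop described above can be sketched in a few lines. This is a minimal, hypothetical sketch, not the paper's actual implementation: `generate` and `fine_tune` are assumed helper functions standing in for LLM sampling and supervised fine-tuning, and the `Answer:` marker is an assumed output format.

```python
def extract_answer(output: str) -> str:
    """Take the text after the final 'Answer:' marker as the model's answer."""
    return output.rsplit("Answer:", 1)[-1].strip()

def star_iteration(model, dataset, few_shot_prompt, generate, fine_tune):
    """One outer loop of STaR: sample a rationale + answer for each question,
    keep only outputs whose final answer matches the label, then fine-tune
    on the kept (question, rationale + answer) pairs."""
    kept = []
    for question, label in dataset:
        output = generate(model, few_shot_prompt + question)
        # Only the final answer is checked against the label; the rationale
        # itself is never verified, which is exactly why a kept rationale
        # can be wrong even when the answer is right.
        if extract_answer(output) == label:
            kept.append((question, output))
    return fine_tune(model, kept)
```

The filter on the final answer is the only supervision signal, so the cross-entropy loss during fine-tuning is computed on the full rationale-plus-answer text of the kept examples, not on the rationale's correctness.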
It was a good video - until you got to quiz time and decided to click your tongue and make incredibly annoying Jeopardy sounds. That cost you a like and a subscribe.
Hello brother, I love your videos ❤ Can you make a complete evolution-of-AI video where you talk about and explain the idea behind AI and how it evolved?
Thank you! This is currently the playlist I am making. You can check out the playlist “Evolution of neural networks”.
Your timer-clock simulation is amazing 🤩
Great job!
Why do you think task-specific models (basic, non-LLM models) can't do arithmetic without a CoT dataset?
How do you tell the LLM to generate a rationale in the STaR algorithm?
You mentioned that we need 10 data examples with rationales. Where are these 10 examples used in the diagram of the STaR algorithm's process?