1:17:00 this is the first time someone has explained the gist of a paper in his own words and I immediately understood what is meant, brilliant. Totally underrated podcast!
Of course we watch to the end! At least I do! No one gives insights to the news like you guys do!
First time listening to you guys! Pretty nice!
❤
Why didn't you mention Quiet-STaR (not to be confused with "Q*") in the coverage of Coconut? It was so slick... very underappreciated IMO. Their reinforcement learning algorithm results in a model that thinks (outputs thought tokens) in gibberish. And they DON'T use special chain-of-thought data; just next-token prediction with the same data used in pretraining. The thought tokens start as natural language, since you start with a normal pretrained model, but they evolve to gibberish as the model gets better. IIRC they didn't call it a "continuous latent space" and generally didn't do much of the usual this-is-a-groundbreaking-paper signaling (e.g. give a convoluted Bayesian derivation and no simple explanation, so the ideas seem more sophisticated). But yeah, pretty much a continuous latent space.
Anyway, love your show!
IIRC, they pointed to making the number of thought tokens per output token variable as an essential open problem. For some fixed k, the model they train "thinks" for k tokens before *every* output token, so inference-time compute is roughly k times that of the base model; very costly. Obviously that's not optimal, since most output tokens are easy to predict.
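To put rough numbers on the cost argument above, here is a minimal sketch that counts generated tokens as a stand-in for compute. It is not the Quiet-STaR implementation: the function names, the `hard` flags, and the idea of gating thoughts on a per-token difficulty flag are all assumptions made up for illustration. Counting the output token itself, a fixed budget costs k + 1 steps per output token, which is close to the "k times" figure for larger k.

```python
from typing import Sequence


def fixed_budget_steps(n_output: int, k: int) -> int:
    """Generation steps when every output token is preceded by k thought tokens."""
    # k thought tokens plus the output token itself, for each output position.
    return n_output * (k + 1)


def gated_budget_steps(hard: Sequence[bool], k: int) -> int:
    """Generation steps when only tokens flagged as hard get k thought tokens.

    The `hard` flags are a hypothetical oracle; how a model would decide this
    is exactly the open problem mentioned in the comment.
    """
    return sum((k + 1) if is_hard else 1 for is_hard in hard)


if __name__ == "__main__":
    k = 8
    # Assumed example: 2 of 10 output tokens are "hard" to predict.
    hard = [False, False, True, False, False, False, True, False, False, False]
    n = len(hard)
    print("fixed budget, steps per output token:", fixed_budget_steps(n, k) / n)
    print("gated budget, steps per output token:", gated_budget_steps(hard, k) / n)
```

Under these made-up numbers the fixed budget spends 9 steps per output token while the gated one spends 2.6, which is the gap a variable thought length would be trying to close.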
That's a good point! Honestly, we probably just didn't think to. There's been a ton of cool research on LLM reasoning over the past year, and there was a lot we could have gotten into... Nice to see that the Coconut paper did cite Quiet-STaR, though.
biggest week
How is this podcast so underrated