LLM · Reddit · 1d ago
A r/MachineLearning post argues that Meta’s COCONUT results may owe more to curriculum design and sequential processing than to the headline mechanism of recycling hidden states as latent thought tokens.
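For context, the mechanism in question can be sketched in miniature. This is a toy illustration, not Meta's implementation: `embed`, `model`, and `to_logits` are hypothetical stand-ins for a real embedding table, transformer forward pass, and output head. The key move COCONUT makes is the latent phase, where the last hidden state is appended directly back into the input sequence as a "continuous thought", skipping the project-to-vocab and re-embed round trip that ordinary token decoding performs.

```python
D = 4       # toy hidden/embedding width
VOCAB = 8   # toy vocabulary size

def embed(tok):
    # Stand-in embedding table: one-hot-ish vector for a token id.
    return [1.0 if i == tok % D else 0.0 for i in range(D)]

def model(seq):
    # Stand-in for a transformer forward pass: each position's "hidden
    # state" is the running mean of all input vectors up to that position.
    out, acc = [], [0.0] * D
    for t, x in enumerate(seq, 1):
        acc = [a + v for a, v in zip(acc, x)]
        out.append([a / t for a in acc])
    return out

def to_logits(h):
    # Stand-in output head projecting a hidden state to vocab logits.
    return [sum(h) * (i + 1) for i in range(VOCAB)]

def generate_with_latent_thoughts(prompt_ids, n_latent, n_tokens):
    seq = [embed(t) for t in prompt_ids]
    # Latent phase: recycle the final hidden state as the next input
    # embedding, so "thoughts" stay continuous vectors, never tokens.
    for _ in range(n_latent):
        seq.append(model(seq)[-1])
    # Language phase: ordinary greedy token decoding resumes.
    out = []
    for _ in range(n_tokens):
        logits = to_logits(model(seq)[-1])
        tok = max(range(VOCAB), key=lambda i: logits[i])
        out.append(tok)
        seq.append(embed(tok))
    return out
```

The Reddit post's skepticism targets exactly this loop: because the latent states are produced sequentially and trained with a staged curriculum, the gains may come from those factors rather than from the hidden-state recycling itself.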