A high-scoring r/MachineLearning post resurfaced David Noel Ng's long-form write-up. Its central claim is that duplicating a seven-layer block from the middle of Qwen2-72B, with no changes to the weights, was enough to reach the top of the open leaderboard.
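To make the idea concrete, here is a minimal sketch of what "duplicating a block without changing weights" could look like at the level of layer ordering. It is not taken from the write-up: the helper name `duplicate_block`, the 80-layer stack, and the start index 40 are all illustrative assumptions, and strings stand in for real transformer layers.

```python
def duplicate_block(layers, start, length):
    """Return a new layer order in which layers[start:start+length] runs twice in a row."""
    if length <= 0 or start < 0 or start + length > len(layers):
        raise ValueError("block must lie inside the layer stack")
    block = layers[start:start + length]
    # The repeated entries are the same objects as the originals,
    # so no weights are copied, retrained, or otherwise modified.
    return layers[:start + length] + block + layers[start + length:]


# Stand-in for a decoder stack. The layer count (80) and the start index (40)
# are assumptions for illustration only; the source gives just the block size (7).
layers = [f"layer_{i}" for i in range(80)]
expanded = duplicate_block(layers, start=40, length=7)
print(len(expanded))
```

In a real model the list entries would be layer modules rather than strings, and the forward pass would simply visit the same seven layers a second time before continuing with the rest of the stack.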