LLM · Hacker News · 1h ago · 2 min read
A high-ranking Hacker News thread amplified Apple's paper on simple self-distillation for code generation, a training recipe that improves pass@1 without verifier models or reinforcement learning.
A Hacker News discussion surfaced a new paper showing that a model can improve coding performance by training on its own sampled answers. The authors report Qwen3-30B-Instruct rising from 42.4% to 55.3% pass@1 on LiveCodeBench v6 without a verifier, a teacher model, or reinforcement learning.
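The reported recipe, training a model on answers it sampled itself with no verifier, teacher, or RL, can be pictured as a short loop: draw several candidates per prompt, pick one with a verifier-free heuristic, and fine-tune on the result. Below is a minimal toy sketch of that loop; the `model` dictionary standing in for an LLM, the example prompt, and the majority-vote selection are all illustrative assumptions, not the paper's exact method.

```python
import random
from collections import Counter

def sample_answers(model, prompt, k=8):
    # Draw k candidate answers from the model's output distribution.
    answers, probs = zip(*model[prompt].items())
    return random.choices(answers, weights=probs, k=k)

def select_by_majority(samples):
    # Verifier-free filter (assumed here): keep the most frequent
    # sampled answer, so no external judge or teacher is needed.
    answer, _ = Counter(samples).most_common(1)[0]
    return answer

def self_distill(model, prompts, k=8):
    # Build a fine-tuning set (prompt, chosen answer) from the
    # model's own samples.
    return [(p, select_by_majority(sample_answers(model, p, k)))
            for p in prompts]

# Toy "model": a per-prompt answer distribution instead of a real LLM.
random.seed(0)
model = {"reverse a list": {"xs[::-1]": 0.6,
                            "reversed(xs)": 0.3,
                            "xs.sort()": 0.1}}
dataset = self_distill(model, ["reverse a list"], k=16)
print(dataset)
```

In the actual setup the fine-tuning step would update the model on `dataset`, sharpening it toward its own most consistent outputs; the sketch stops at dataset construction.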