Cursor details Composer 2's training stack in a new technical report
On March 24, 2026, Cursor’s X account said it was releasing a technical report explaining how Composer 2 was trained. The announcement points developers to a deeper write-up that moves beyond the launch blog and explains the training, evaluation, and systems work behind the model.
What Cursor disclosed
According to Cursor’s March 27 technical report and March 19 launch post, Composer 2 is trained in two phases: continued pretraining on a code-heavy mix to strengthen base coding knowledge, followed by large-scale reinforcement learning to improve end-to-end agent performance. Cursor says the RL setup uses the same tools and harness as the deployed product, with tasks drawn from the kinds of ambiguous, multi-file problems developers actually ask the product to solve.
Cursor also introduced CursorBench as an internal benchmark derived from real engineering sessions. On the metrics cited in the official posts, Composer 2 scores 61.3 on CursorBench, 61.7 on Terminal-Bench 2.0, and 73.7 on SWE-bench Multilingual. Cursor positions the model as a frontier-level coding system with lower cost than prior Composer releases, listing $0.50 per million input tokens and $2.50 per million output tokens for the standard variant.
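At the listed rates, per-request cost is simple arithmetic. The sketch below is illustrative only: the rates come from Cursor's posts for the standard variant, but the token counts are made-up examples, not figures from the report.

```python
# Cost estimate at the standard-variant rates Cursor lists:
# $0.50 per million input tokens, $2.50 per million output tokens.
INPUT_RATE = 0.50 / 1_000_000   # dollars per input token
OUTPUT_RATE = 2.50 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical request: 20k input tokens, 4k output tokens
print(f"${request_cost(20_000, 4_000):.4f}")
```

For this example, input and output contribute one cent each, so the request costs about $0.02; at these prices, output tokens dominate once a completion exceeds one fifth of the prompt length.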
Why it matters
The announcement is notable because coding-model vendors rarely disclose much about the path from base model to agent behavior. Cursor’s report makes the case that agentic coding quality depends not just on a stronger base model, but on training in realistic tool-using environments, long-horizon tasks, and evaluation loops that look closer to production than to narrow benchmark prompts. For teams using coding assistants, the key takeaway is that model training recipes are increasingly being optimized around full workflow completion rather than isolated code generation.
Source materials include Cursor’s X post, its technical report, and the Composer 2 launch article.
Related Articles
Cursor has published the Composer 2 technical report, outlining its code-focused continued pretraining, large-scale reinforcement learning pipeline, and CursorBench-led evaluation strategy. The report offers an unusually detailed first-party look at how a production coding agent is trained and measured.
Cursor said on March 26, 2026 that real-time reinforcement learning lets it ship improved Composer 2 checkpoints every five hours. Cursor’s March 27 technical report says the model combines continued pretraining on Kimi K2.5 with large-scale RL in realistic Cursor sessions, scores 61.3 on CursorBench, and runs on an asynchronous multi-region RL stack with large sandbox fleets.
Cursor said on March 26, 2026 that real-time reinforcement learning lets it ship improved Composer checkpoints as often as every five hours. Cursor's research post says the loop trains on billions of production tokens from real user interactions, runs evals including CursorBench before deployment, and has already shown gains in edit persistence, dissatisfied follow-ups, and latency.