Cursor details Composer 2's training stack in a new technical report
On March 24, 2026, Cursor’s X account said it was releasing a technical report explaining how Composer 2 was trained. The announcement points developers to a deeper write-up that moves beyond the launch blog and explains the training, evaluation, and systems work behind the model.
What Cursor disclosed
According to Cursor’s March 27 technical report and March 19 launch post, Composer 2 is trained in two phases: continued pretraining on a code-heavy mix to strengthen base coding knowledge, followed by large-scale reinforcement learning to improve end-to-end agent performance. Cursor says the RL setup uses the same tools and harness as the deployed product, with tasks drawn from the kinds of ambiguous, multi-file problems developers actually ask the product to solve.
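The described setup, a policy that acts through the same tool harness as the shipped product and is rewarded on end-to-end task completion, can be sketched in outline. Everything below (class names, the stub environment, the reward shape) is a hypothetical illustration of that pattern, not Cursor's actual code.

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str   # e.g. "read_file", "edit", "run_tests"
    args: dict

@dataclass
class Episode:
    task: str
    steps: list = field(default_factory=list)
    reward: float = 0.0

class StubEnv:
    """Toy stand-in for a sandboxed coding environment (hypothetical)."""
    def reset(self, task):
        self.done_after = 3   # pretend the task takes three tool calls
        self.calls = 0
        return f"task: {task}"
    def step(self, call):
        self.calls += 1
        return f"observation after {call.name}", self.calls >= self.done_after
    def score(self):
        # End-to-end reward: did the episode finish the task (e.g. tests pass)?
        return 1.0 if self.calls >= self.done_after else 0.0

class StubPolicy:
    """Toy stand-in for the model; always proposes one tool call."""
    def act(self, obs):
        return ToolCall(name="run_tests", args={})

def rollout(policy, env, task, max_steps=50):
    """Run one agent episode: the policy proposes tool calls, the harness
    executes them, and the environment scores the finished episode."""
    ep = Episode(task=task)
    obs = env.reset(task)
    for _ in range(max_steps):
        call = policy.act(obs)
        obs, done = env.step(call)
        ep.steps.append(call)
        if done:
            break
    ep.reward = env.score()
    return ep

ep = rollout(StubPolicy(), StubEnv(), "fix failing test in multi-file repo")
print(len(ep.steps), ep.reward)  # 3 1.0
```

The point of the pattern is that reward attaches to whole-episode outcomes rather than individual completions, which is what distinguishes this kind of agent training from supervised code generation.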
Cursor also introduced CursorBench as an internal benchmark derived from real engineering sessions. On the metrics cited in the official posts, Composer 2 scores 61.3 on CursorBench, 61.7 on Terminal-Bench 2.0, and 73.7 on SWE-bench Multilingual. Cursor positions the model as a frontier-level coding system with lower cost than prior Composer releases, listing $0.50 per million input tokens and $2.50 per million output tokens for the standard variant.
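The listed per-token rates translate directly into per-request costs. A minimal sketch of that arithmetic, with illustrative (not measured) request sizes:

```python
# Back-of-the-envelope cost at the article's listed Composer 2 rates.
INPUT_RATE = 0.50 / 1_000_000   # USD per input token
OUTPUT_RATE = 2.50 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: an agent session with 200k tokens of context in and 20k tokens out.
cost = request_cost(200_000, 20_000)
print(f"${cost:.2f}")  # 200k @ $0.50/M + 20k @ $2.50/M = $0.10 + $0.05 = $0.15
```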
Why it matters
The announcement is notable because coding-model vendors rarely disclose much about the path from base model to agent behavior. Cursor’s report makes the case that agentic coding quality depends not just on a stronger base model, but on training in realistic tool-using environments, long-horizon tasks, and evaluation loops that look closer to production than to narrow benchmark prompts. For teams using coding assistants, the key takeaway is that model training recipes are increasingly being optimized around full workflow completion rather than isolated code generation.
Source materials include Cursor’s X post, its technical report, and the Composer 2 launch article.