Together AI Open-Sources Open Deep Research v2 with Dataset, Code, and a Multi-Step Research Workflow

On March 13, 2026, Together AI announced on X that v2 of its Open Deep Research app is fully free and open source. The company said it is releasing the evaluation dataset, code, app, and blog post together with the update. That matters because deep research has quickly become one of the most visible agent workflows in AI: instead of returning a short answer, the system plans a task, searches the web, evaluates evidence, and then produces a longer report with citations.

The companion Open Deep Research blog post explains the mechanics. Together describes a workflow built around planning and self-reflection. The system starts by generating search queries, collects web results, checks for knowledge gaps, and iterates until it has enough material to write a report. The company frames this as a response to multi-hop questions where a single search step is not enough and where users need synthesis rather than a list of links.
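The loop described above can be illustrated with a minimal Python sketch. This is not Together's code: the function names (research_loop, plan_queries, search, find_gaps), the round limit, and the toy data are all assumptions made for illustration, and the stand-in functions replace what would be LLM and web-search calls in a real system.

```python
from typing import Callable

def research_loop(
    topic: str,
    plan_queries: Callable[[str, list[str]], list[str]],
    search: Callable[[str], list[str]],
    find_gaps: Callable[[str, list[str]], list[str]],
    max_rounds: int = 3,  # hypothetical budget, not a documented setting
) -> list[str]:
    """Collect evidence until no gaps remain or the round budget is spent."""
    evidence: list[str] = []
    gaps: list[str] = []
    for _ in range(max_rounds):
        # Plan queries from the topic on the first pass, from gaps afterwards.
        for query in plan_queries(topic, gaps):
            for result in search(query):
                if result not in evidence:
                    evidence.append(result)
        # Self-reflection step: ask what is still missing.
        gaps = find_gaps(topic, evidence)
        if not gaps:
            break
    return evidence

# Toy stand-ins (hypothetical) so the sketch runs without any network or model.
FAKE_WEB = {
    "open deep research": ["ODR plans, searches, and writes a report"],
    "evaluation dataset": ["An evaluation dataset is published alongside the code"],
}

def toy_plan(topic: str, gaps: list[str]) -> list[str]:
    return gaps if gaps else [topic]

def toy_search(query: str) -> list[str]:
    return FAKE_WEB.get(query, [])

def toy_gaps(topic: str, evidence: list[str]) -> list[str]:
    # Pretend the reviewer wants at least one note about the dataset.
    if any("dataset" in note for note in evidence):
        return []
    return ["evaluation dataset"]

notes = research_loop("open deep research", toy_plan, toy_search, toy_gaps)
```

In this toy run the first round leaves a gap, the second round fills it, and the loop stops early. A report-writing step would then consume the collected notes.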

What ships in v2

The public app, announced on X.
An evaluation dataset on Hugging Face.
The open-source codebase on GitHub.
A technical write-up describing architecture, benchmarks, and limitations.

Together also makes clear that this is not a single-model demo. In the blog, different models are assigned to planning, summarization, JSON extraction, and final report writing. The company says this role-based design is intended to balance quality, latency, and cost. It also describes caching to reduce repeated search expense during evaluation, and says a typical reply takes 2 to 5 minutes without podcast generation. That is a useful reminder that high-quality research agents are still materially slower than ordinary chat completions.
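As a rough illustration of the two ideas in this paragraph, the Python sketch below pairs a role-to-model mapping with a simple search cache. Everything here is assumed for the example: the RoleConfig and CachedSearch names, the placeholder model identifiers, and the in-memory dictionary cache are hypothetical and are not taken from Together's codebase.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class RoleConfig:
    """One model identifier per pipeline role (placeholder names only)."""
    planner: str
    summarizer: str
    json_extractor: str
    report_writer: str

# Hypothetical assignment: a real setup would trade quality, latency, and cost.
ROLES = RoleConfig(
    planner="placeholder-planner-model",
    summarizer="placeholder-summarizer-model",
    json_extractor="placeholder-json-model",
    report_writer="placeholder-writer-model",
)

class CachedSearch:
    """Wrap a search backend so repeated queries do not hit it again."""

    def __init__(self, backend: Callable[[str], list[str]]) -> None:
        self._backend = backend
        self._cache: dict[str, list[str]] = {}
        self.backend_calls = 0  # counts real (uncached) lookups

    def __call__(self, query: str) -> list[str]:
        key = query.strip().lower()  # normalise so trivial variants share an entry
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._backend(query)
        return self._cache[key]

# Usage with a fake backend; a real one would call a paid search API.
search = CachedSearch(lambda q: [f"result for {q}"])
first = search("deep research")
second = search("Deep Research ")  # served from the cache
```

During repeated evaluation runs over the same questions, a cache like this means only the first lookup for each query incurs search cost. A persistent store would be needed for the saving to carry across processes.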

For developers, the more durable signal is openness. Together is publishing not only a polished demo but also the pieces needed to benchmark, fork, and extend it. That gives teams a reference implementation for multi-step web research, source ranking, and long-form report generation with citations. The company also openly lists limitations around hallucinations, search bias, and freshness, which makes the release more credible than a simple launch post.

The result is less a model announcement than an attempt to establish an open baseline for research agents. If the community adopts the code and dataset, Open Deep Research v2 could become a practical benchmark for comparing planning loops, retrieval strategies, and report quality across open LLM stacks.
