Together AI expands fine-tuning to tool calling, reasoning traces, and VLM post-training

Original post (@togethercompute): "What's new: 👉 Tool call fine-tuning with end-to-end OpenAI-compatible schema validation 👉 Reasoning fine-tuning with native thinking token support 👉 Vision-language model fine-tuning for domain-specific visual data 👉 Up to 6x throughput gains on MoE models with cost and time estimation before and during training"

LLM · Mar 23, 2026 · By Insights AI · 2 min read

What Together AI posted on X

On March 19, 2026, Together AI highlighted four parts of a single fine-tuning update on X: tool call fine-tuning with end-to-end OpenAI-compatible schema validation, reasoning fine-tuning with native thinking token support, vision-language model fine-tuning for domain-specific visual data, and up to 6x throughput gains on MoE models with cost and time estimates before and during training.

That combination matters because it treats post-training as an agent-systems problem, not just a generic supervised fine-tune job. Once teams depend on tool use, long reasoning traces, and multimodal inputs, small formatting or infrastructure failures can break production behavior even when the base model is strong.

What the Together AI blog adds

The March 18 blog post fills in the implementation details. Together says the service now supports tool-call data in an OpenAI-compatible schema and validates that every tool_calls entry matches a declared tool before training begins. At inference time, the company says it also improved tool-call parsing and validation so the gains from tool-call fine-tuning carry into production.
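To make the schema-validation claim concrete, here is a minimal sketch of what an OpenAI-compatible tool-call training example looks like, with an illustrative pre-training check that every `tool_calls` entry names a declared tool and carries valid JSON arguments. The field names follow the OpenAI chat format; the validator itself is a hypothetical stand-in, not Together's implementation.

```python
import json

# One training example in the OpenAI-compatible chat format: the assistant
# message carries a tool_calls entry that must match a tool declared in "tools".
example = {
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "messages": [
        {"role": "user", "content": "Weather in Paris?"},
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "arguments": json.dumps({"city": "Paris"}),
                    },
                }
            ],
        },
    ],
}

def validate_tool_calls(ex: dict) -> list[str]:
    """Return a list of problems; an empty list means the example passes."""
    declared = {t["function"]["name"] for t in ex.get("tools", [])}
    problems = []
    for msg in ex["messages"]:
        for call in msg.get("tool_calls", []):
            name = call["function"]["name"]
            if name not in declared:
                problems.append(f"undeclared tool: {name}")
            try:
                # Arguments are stored as a JSON string and must parse cleanly.
                json.loads(call["function"]["arguments"])
            except json.JSONDecodeError:
                problems.append(f"bad arguments JSON for {name}")
    return problems

print(validate_tool_calls(example))  # → []
```

A check like this, run before training starts, is what catches the small formatting failures the article warns about before they become silent production bugs.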

For reasoning models, Together says fine-tuning can now use reasoning or reasoning_content fields in assistant messages, which is meant to preserve structured thinking traces for domain-specific reasoning tasks. For vision-language models, the service supports inline base64 images, mixed image-text and text-only datasets, and an optional train_vision=true flag for teams that want to update both the vision encoder and the language layers.

The infrastructure update is just as notable. Together says it upgraded the stack to handle 100B+ parameter models more efficiently, support datasets up to 100GB, and deliver at least 2x throughput improvements across models, with larger systems like Kimi K2.5 improving by more than 6x. It also added price estimates before launch and a live ETA during training.
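The blog does not publish its pricing formula, but the kind of pre-launch estimate it describes reduces to simple arithmetic over token counts and observed throughput. The sketch below is a hypothetical back-of-envelope calculation with made-up parameter names, not Together's actual pricing or ETA model.

```python
def estimate_job(tokens_per_epoch: int, epochs: int,
                 price_per_million_tokens: float,
                 observed_tokens_per_sec: float) -> tuple[float, float]:
    """Rough pre-launch cost and ETA for a fine-tuning job.

    Returns (cost_usd, eta_hours). A live ETA during training would simply
    re-run this with the remaining tokens and the throughput measured so far.
    """
    total_tokens = tokens_per_epoch * epochs
    cost_usd = total_tokens / 1e6 * price_per_million_tokens
    eta_hours = total_tokens / observed_tokens_per_sec / 3600
    return cost_usd, eta_hours

# Example: 50M tokens/epoch, 3 epochs, $2 per 1M tokens, 20k tokens/sec.
cost, eta = estimate_job(50_000_000, 3, 2.0, 20_000)
print(f"~${cost:.0f}, ~{eta:.1f} h")  # → ~$300, ~2.1 h
```

The practical point is that a 6x throughput gain feeds directly into both numbers: the same job finishes in a sixth of the wall-clock time, and any duration-based cost shrinks with it.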

Why this matters

The practical signal is that post-training is becoming a product surface rather than a research-only workflow. Teams want fine-tuning systems that understand structured tool schemas, long reasoning traces, and multimodal examples without forcing them to build custom pipelines around each model family.

If Together's reliability and planning claims hold in real workloads, the bigger shift is operational: more frequent domain-specific post-training, less guesswork around cost and completion times, and faster iteration for agent products that depend on tool use and multimodal context. That pushes fine-tuning closer to regular application engineering instead of a one-off infrastructure project.

Sources: Together AI X post · Together AI blog




© 2026 Insights. All rights reserved.