Together AI expands fine-tuning with tool calling, reasoning, and VLM support plus faster MoE training
Original post: Together Fine-tuning now supports tool calling, reasoning, and vision-language model fine-tuning. Train models up to 1T parameters with up to 6x higher throughput on MoE architectures.
What Together AI announced on X
On March 19, 2026, Together AI said its fine-tuning service now supports tool calling, reasoning, and vision-language model training. It also claimed up to 6x higher throughput on mixture-of-experts architectures and framed the service as capable of handling models up to the 1T-parameter class. That makes the update more than a simple feature release: it is a push to turn fine-tuning into a broader post-training stack for agent and multimodal workloads.
What the official blog adds
Together AI’s post lays out the failure modes it is trying to solve: tool calls that do not match schemas, reasoning quality that degrades across longer interactions, and models that miss domain-specific visual cues. The updated service addresses those with OpenAI-compatible schema support for tool-call training, direct fine-tuning on thinking traces for reasoning models, and native vision-language model fine-tuning using image-plus-text training data.
- The service now supports datasets up to 100GB.
- The launch materials cite support for models up to the 1T-parameter class, while the blog is more precise: the training stack was upgraded to handle 100B+-parameter models more efficiently, and it discusses the engineering challenges of trillion-parameter training without claiming that scale is generally available.
- Together AI lists new fine-tuning support for models including Qwen 3.5 variants, Kimi K2.5, Kimi K2, GLM-4.7, and GLM-4.6.
- The company also added job cost estimates before launch and live ETA tracking while training runs are in progress.
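To make the "OpenAI-compatible schema support" concrete, here is a minimal sketch of what training records in that style look like. The field names (`messages`, `tool_calls`, `tools`, `image_url`) follow the widely used OpenAI chat-completions convention; whether Together's fine-tuning service expects exactly this layout is an assumption, and the function name and URLs are illustrative:

```python
import json

# One training example in OpenAI-compatible chat format with a tool call.
# The "arguments" field is a JSON-encoded string, not a nested dict.
tool_call_example = {
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"},
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "arguments": json.dumps({"city": "Berlin"}),
                    },
                }
            ],
        },
        {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 7}'},
        {"role": "assistant", "content": "It's currently 7 °C in Berlin."},
    ],
    # The tool schema the model is trained to match exactly.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# A vision-language example: image-plus-text content parts in one user turn.
vlm_example = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Is this part defective?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/part.jpg"}},
            ],
        },
        {"role": "assistant",
         "content": "Yes: hairline crack near the left mounting hole."},
    ]
}

# Fine-tuning datasets are typically shipped as JSONL, one record per line.
with open("train.jsonl", "w") as f:
    for record in (tool_call_example, vlm_example):
        f.write(json.dumps(record) + "\n")
```

The point of training directly on records like these, rather than on plain text, is that the model learns to emit arguments that validate against the declared parameter schema, which is exactly the "tool calls that do not match schemas" failure mode the blog describes.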
Why this matters
Enterprise fine-tuning demand is shifting away from “make the base model sound more like us” and toward more operational goals: get agents to call tools reliably, preserve structured reasoning across long workflows, and adapt models to domain-specific images. Together AI is trying to package those needs into one managed training service rather than leaving them spread across custom scripts, separate inference fixes, and infrastructure-heavy experimentation.
The throughput claim matters just as much as the feature list. Together AI says every model in the updated training stack trains at least 2x faster, with the largest models seeing gains above 6x. If that holds in real usage, the practical effect is not just shorter training jobs. It means faster iteration loops, more experiments per team, and a lower barrier to treating post-training as a continuous product workflow rather than an occasional specialized project. In other words, the competitive frontier is moving from model access alone toward how quickly platforms can help teams shape, validate, and deploy those models for production tasks.
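A back-of-the-envelope calculation shows why throughput translates into iteration speed. The 24-hour baseline job and the always-on weekly schedule below are illustrative assumptions, not Together AI figures; only the 2x and 6x multipliers come from the announcement:

```python
def experiments_per_week(baseline_job_hours: float, speedup: float,
                         hours_per_week: float = 168.0) -> float:
    """Fine-tuning runs that fit in a week at a given throughput multiple,
    assuming back-to-back jobs on a single training allocation."""
    return hours_per_week / (baseline_job_hours / speedup)

# A job that took 24 h at baseline throughput:
baseline = experiments_per_week(24.0, 1.0)  # 7 runs/week
doubled  = experiments_per_week(24.0, 2.0)  # 14 runs/week
moe_gain = experiments_per_week(24.0, 6.0)  # 42 runs/week
```

Going from 7 to 42 candidate runs a week is the difference between fine-tuning as a quarterly project and fine-tuning as part of a regular evaluation loop.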
Sources: Together AI X post · Together AI fine-tuning update
Related Articles
A high-engagement r/LocalLLaMA post highlighted Unsloth Studio, a beta open-source web UI that aims to train, run, and export open models from one local interface. The discussion framed it as a possible LM Studio challenger in the GGUF ecosystem, while top commenters noted that many advanced users still lean on vLLM or direct llama.cpp workflows.
OpenAI said on March 5, 2026 that GPT-5.4 Thinking shows low Chain-of-Thought controllability, which for now strengthens CoT monitoring as a safety signal. The release pairs an X post with a new open-source evaluation suite and research paper.
A project post in r/MachineLearning points to mlx-tune, a library that wraps Apple’s MLX stack in an Unsloth-compatible training API for SFT, DPO, GRPO, LoRA, and vision-language fine-tuning on Apple Silicon Macs.