#together-ai

LLM X/Twitter Apr 5, 2026 1 min read

Together Research says LLMs can repair bad database query plans

Together Research says LLMs can patch faulty database query plans instead of regenerating them from scratch, and claims up to 4.78x speedups on some TPC-H and TPC-DS workloads. The tweet points to DBPlanBench, a DataFusion-based harness that exposes a physical operator graph to an LLM and uses iterative search to refine plan edits.

#together-ai #dbplanbench #query-optimization

AI X/Twitter Apr 4, 2026 2 min read

Together AI brings Wan 2.7 video generation and editing workflows onto one API surface

Together AI said on April 3, 2026 that Wan 2.7 from Alibaba Cloud is now available on its platform. The accompanying product post says text-to-video is live now, with image-to-video, reference-to-video, and video edit workflows rolling out on the same API, auth, and billing surface.

#together-ai #wan-2-7 #video-generation

LLM X/Twitter Apr 1, 2026 2 min read

Together Research releases Aurora for RL-based adaptive speculative decoding

Together Research said on March 31, 2026 that Aurora is an open-source framework for adaptive speculative decoding that learns from live inference traces and updates the speculator asynchronously without interrupting serving. Together’s blog and paper say Aurora reframes the problem as asynchronous RL and can deliver 1.25x additional speedup over a strong static speculator as traffic shifts.

#together-ai #aurora #speculative-decoding

LLM X/Twitter Mar 27, 2026 2 min read

Together Research says divide-and-conquer long-context pipelines can beat GPT-4o single-shot

Together Research said on March 27, 2026 that a smaller model using divide-and-conquer can match or outperform GPT-4o on long-context tasks, with the work accepted at ICLR 2026. Together's blog and the arXiv paper say the method uses a planner-worker-manager pipeline and explains long-context failures in terms of task, model, and aggregator noise.

#together-ai #long-context #multi-agent

LLM X/Twitter Mar 23, 2026 2 min read

Together AI expands fine-tuning to tool calling, reasoning traces, and VLM post-training

Together AI said on March 19, 2026 that its fine-tuning service now supports tool-call, reasoning, and vision-language workflows. The linked Together AI blog adds 100B+ parameter model support, datasets up to 100GB, up to 6x higher throughput on large MoE models, and upfront cost plus ETA estimates.

#together-ai #fine-tuning #tool-calling

LLM X/Twitter Mar 22, 2026 2 min read

Together AI expands fine-tuning with tool calling, reasoning, and VLM support plus faster MoE training

Together AI said on March 19, 2026 that its fine-tuning service now supports tool calling, reasoning, and vision-language model training, with up to 6x higher throughput on MoE architectures. The company says the update also targets very large models, supports datasets up to 100GB, and adds pre-run cost estimates plus live ETAs during training.

#together-ai #fine-tuning #tool-calling

LLM Hacker News Feb 21, 2026 2 min read

HN Highlights CDLM: Block-Wise KV Caching and Step Reduction for Faster Diffusion LLM Inference

A high-score Hacker News discussion surfaced Together AI's CDLM post, which claims up to 14.5x latency improvements for diffusion language models by combining trajectory-consistent step reduction with exact block-wise KV caching.

#diffusion-language-models #llm-inference #kv-cache