Together AI brings Wan 2.7 video generation and editing workflows onto one API surface
Original post: "Introducing Wan 2.7 from @alibaba_cloud on Together AI. AI natives can now build with Wan 2.7 on Together AI and get a clearer path from first-generation video to continuation, reference-driven control, and editing on one production platform."
What Together AI posted on X
On April 3, 2026, Together AI announced that Wan 2.7 from Alibaba Cloud is coming to its platform, positioned as a more unified video workflow stack. The framing in the X post is not just about model availability: Together is emphasizing that teams can move from an initial generated clip to continuation, reference-driven control, and editing on one production platform, rather than stitching those tasks together across separate tools.
That positioning addresses a real problem in multimodal development. Video generation is easy to demo, but difficult to steer once a project needs continuity, reference matching, revision, or editorial control. Teams often end up bouncing between different model providers and post-processing systems. Together’s message is that Wan 2.7 can reduce that fragmentation by bringing more of the workflow into a single operational surface.
What the product post says
Together’s product note describes Wan 2.7 as a four-model suite covering generation, continuation, reference-driven workflows, and editing. Text-to-video is available now through the model Wan-AI/wan2.7-t2v, while image-to-video, reference-to-video, and video editing are planned to roll out next on the same platform.
The currently available text-to-video offering supports 720p and 1080p outputs, durations of 2 to 15 seconds, optional audio input, and prompt-driven multi-shot direction. Together also says the service uses the same APIs, authentication, SDKs, and billing surface that customers already use for the rest of their multimodal stack, with pricing starting at $0.10 per second of generated video.
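For teams sizing up the integration effort, those specs map to a very small request surface. The sketch below is illustrative only: the endpoint path, request field names, and response handling are assumptions rather than documented behavior, while the model identifier Wan-AI/wan2.7-t2v, the 2-to-15-second duration range, and the $0.10-per-second rate come from Together's announcement.

```python
import os
import requests

TOGETHER_API_KEY = os.environ["TOGETHER_API_KEY"]

# Assumed endpoint and payload shape -- a sketch, not Together's documented video API.
# The model name, duration range, and per-second price are taken from the announcement.
ENDPOINT = "https://api.together.xyz/v1/videos/generations"  # assumption
MODEL = "Wan-AI/wan2.7-t2v"
PRICE_PER_SECOND_USD = 0.10

def generate_clip(prompt: str, duration_s: int = 5, resolution: str = "1080p") -> dict:
    """Request a text-to-video clip (2-15 s, 720p or 1080p per the announcement)."""
    if not 2 <= duration_s <= 15:
        raise ValueError("Wan 2.7 text-to-video durations are 2 to 15 seconds")
    resp = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {TOGETHER_API_KEY}"},
        json={
            "model": MODEL,
            "prompt": prompt,          # multi-shot direction is expressed in the prompt
            "duration": duration_s,    # assumed field name
            "resolution": resolution,  # assumed field name
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    duration = 8
    # Per-second billing makes cost predictable before the request is sent.
    print(f"Estimated cost: ${duration * PRICE_PER_SECOND_USD:.2f}")
    result = generate_clip(
        "A drone shot over a coastline at sunrise, then a close-up of breaking waves",
        duration_s=duration,
    )
    print(result)
```

Because billing is per generated second, a cost estimate can be computed before any request is made, which matters when budgeting batch generation or iterative revision loops.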
Why it matters
This is significant because the differentiator in video AI is shifting from “can the model generate a clip?” to “can the platform support production iteration?” Teams need continuation, reference control, editing, and predictable operational interfaces. If those pieces arrive behind one API contract instead of a collection of disconnected services, video becomes easier to integrate into real applications and internal pipelines.
For Together AI, Wan 2.7 is also a positioning move. The company is trying to show that multimodal infrastructure can be unified in the same way text inference was unified: one account, one billing model, one developer surface, multiple capabilities. If that approach works, the platform becomes more valuable not because it hosts one model, but because it reduces the operational cost of using many video workflows together.
Related Articles
Google expanded Search Live on March 26, 2026 to every language and location where AI Mode is available. The move pushes multimodal voice-and-camera search to more than 200 countries and territories and gives Gemini’s live audio stack a much larger real-world footprint.
Netflix’s VOID reached Reddit as an open research release aimed at removing objects from video and repairing the interactions those objects caused in the scene. The notable details are the CogVideoX base, a two-pass pipeline, Gemini+SAM2 mask generation, and a 40GB+ VRAM requirement.
Google says Cinematic Video Overviews are rolling out to NotebookLM Ultra users in English. The company says the feature combines Gemini 3, Nano Banana Pro, and Veo 3 to generate more immersive videos than the earlier narrated-slide format.