#video-generation

AI Hacker News 3d ago 1 min read

FLUX 3 Pushes Past Image Generation Into Video, Audio, and Action

The discussion moved quickly from sample quality to the larger claim: one multimodal backbone for visual generation and action prediction.

#flux #multimodal #video-generation

Sciences Hacker News Jul 10, 2026 1 min read

NEvo Searches for Videos That Maximally Activate a Brain Region

The discussion split between scientific utility and obvious misuse concerns. NEvo uses a digital twin of the visual brain as a reward model, then evolves AI-generated clips to maximize predicted activation in target regions.

#neuroai #video-generation #brain

AI Jul 8, 2026 1 min read

Meta’s Muse Image puts tool-using generation inside Instagram and WhatsApp

Meta has rolled out Muse Image across Meta AI, meta.ai, Instagram Stories in the U.S., and WhatsApp in a limited set of countries. The notable shift is an image model that uses search, code execution, self-refinement, and a watermarking system inside consumer social products.

#meta #muse #image-generation

AI X/Twitter Jul 8, 2026 1 min read

NVIDIA MOTIVE picks motion-critical video clips and wins 74.1% preference

NVIDIA Research’s MOTIVE targets a specific video-model bottleneck: which fine-tuning clips actually improve motion. The ICML 2026 honored paper reports a 74.1% human preference result against the base model.

#nvidia #video-generation #icml-2026

AI X/Twitter May 25, 2026 1 min read

Meituan puts LongCat-Video-Avatar 1.5 on Hugging Face with MIT license

Meituan’s LongCat team released an audio-driven avatar video model with Diffusers examples and an MIT license on Hugging Face. The project compares against HeyGen, Kling Avatar 2.0, and OmniHuman-1.5.

#meituan #longcat #video-generation

AI X/Twitter May 21, 2026 1 min read

Google DeepMind Launches Gemini Omni: Generate Video from Any Input

At Google I/O 2026, Google DeepMind unveiled Gemini Omni, its first model capable of generating video from any input, including text, images, audio, and video. Combining Gemini's intelligence with Google's generative media systems, it is available now through the Gemini app and YouTube Shorts.

#google #gemini #video-generation

AI May 20, 2026 1 min read

Google Unveils Gemini Omni at I/O 2026: A "World Model" That Rewrites Video Editing

At I/O 2026, Google revealed Gemini Omni, a "world model" that processes text, audio, images, and video together to simulate physical environments. Unlike Sora or Runway, it lets users edit footage through natural language and maintains scene consistency across modifications. It replaces Veo in the Gemini app immediately.

#google #gemini #video-generation

AI Hacker News May 16, 2026 1 min read

NVIDIA's SANA-WM: Open-Source 2.6B World Model for 1-Minute 720p Video

NVIDIA Labs released SANA-WM, a 2.6B-parameter open-source world model capable of generating up to one minute of 720p video. Its relatively small size and open-source availability make it a notable contribution to accessible video generation research.

#video-generation #nvidia #open-source

AI Reddit May 12, 2026 1 min read

Google's 'Omni' Video Model Leaks with Notably Coherent Text Rendering

A video believed to be from Google's unreleased 'Omni' video generation model has leaked, drawing 1,300+ upvotes on r/singularity. Users particularly noted the model's unusually coherent text rendering, a persistent weakness in current video generation models.

#google #video-generation #omni

AI Apr 11, 2026 1 min read

Google launches Veo 3.1 Lite as a lower-cost video model for developers

Google introduced Veo 3.1 Lite as its most cost-effective video generation model, priced at less than 50% of the cost of Veo 3.1 Fast while keeping the same speed. The model is rolling out through the paid tier of the Gemini API and Google AI Studio, broadening access for higher-volume video applications.

#google #veo #video-generation

AI Reddit Apr 5, 2026 2 min read

Reddit picks up Netflix’s VOID video object-deletion model

Netflix’s VOID reached Reddit as an open research release aimed at removing objects from video and repairing the interactions those objects caused in the scene. The notable details are the CogVideoX base, a two-pass pipeline, Gemini+SAM2 mask generation, and a 40GB+ VRAM requirement.

#video-editing #video-generation #inpainting

AI X/Twitter Apr 4, 2026 2 min read

Together AI brings Wan 2.7 video generation and editing workflows onto one API surface

Together AI said on April 3, 2026 that Wan 2.7 from Alibaba Cloud is now available on its platform. The accompanying product post says text-to-video is live now, with image-to-video, reference-to-video, and video edit workflows rolling out on the same API, auth, and billing surface.

#together-ai #wan-2-7 #video-generation