
Liquid AI Releases LFM2.5: 8B MoE Model Trained on 38T Tokens

Original: Liquid AI reveals 8B-A1B MoE trained on 38T

May 30, 2026, by Insights AI (HN)

A New Edge AI Benchmark

Liquid AI has released LFM2.5 8B-A1B, a full-scale upgrade to its October 2025 predecessor. The model uses a Mixture-of-Experts architecture optimized for on-device AI across edge hardware. Training data grew from 12T to 38T tokens, more than triple the previous version's.
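The "8B-A1B" name suggests roughly 8B total parameters with about 1B active per token, which is the usual payoff of a Mixture-of-Experts design: a router selects a few experts for each token, so only a fraction of the weights run. The article does not describe Liquid AI's routing, so the following is only a generic top-k gating sketch in plain Python. The expert count, `top_k`, and all function names are illustrative assumptions, not details of LFM2.5.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, top_k=2):
    """Pick the top_k experts for one token and renormalize their weights.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(router_logits)
    # Rank experts by router probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]

def moe_forward(x, experts, router_logits, top_k=2):
    """Weighted sum of the selected experts' outputs; other experts never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, top_k))

# Hypothetical toy setup: 8 experts, each a simple scalar function.
experts = [lambda x, k=k: (k + 1) * x for k in range(8)]
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]

selected = route_top_k(logits, top_k=2)
output = moe_forward(1.0, experts, logits, top_k=2)
print("selected experts:", [i for i, _ in selected])
print("combined output:", round(output, 4))
```

In this toy setup only 2 of the 8 experts execute for the token. The same mechanism lets a model with a large total parameter count keep per-token compute small.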

Key Technical Improvements

The context window expands 4x, from 32K to 128K tokens, and the vocabulary roughly doubles, from 65K to 128K. Multilingual tokenization saw major gains: Hindi +120.4%, Thai +238.2%, Vietnamese +117.9%. The model also introduces targeted probability redistribution to address reasoning-loop failures, plus knowledge-boundary optimization to mitigate hallucinations.
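The scaling factors above are easy to check directly. The article does not say what the tokenization percentages measure. The sketch below assumes they are efficiency gains (more text covered per token), in which case a gain of g% means the same text needs 1 / (1 + g/100) of the tokens it needed before. That interpretation and the helper names are assumptions, not claims from the source.

```python
def ratio(new, old):
    """Simple scale factor between two sizes."""
    return new / old

def relative_token_count(gain_percent):
    """Fraction of the old token count needed for the same text.

    ASSUMPTION: the reported gain is text-per-token efficiency.
    """
    return 1.0 / (1.0 + gain_percent / 100.0)

# "K" is treated as 1,000 here; using 1,024 gives the same context ratio.
context_ratio = ratio(128_000, 32_000)
vocab_ratio = ratio(128_000, 65_000)

gains = {"Hindi": 120.4, "Thai": 238.2, "Vietnamese": 117.9}
relative_tokens = {lang: relative_token_count(g) for lang, g in gains.items()}

print(f"context window: {context_ratio:.2f}x")
print(f"vocabulary:     {vocab_ratio:.2f}x")
for lang, frac in relative_tokens.items():
    print(f"{lang}: about {frac:.0%} of the previous token count")
```

Under that assumption, Thai text would need a bit under a third of the tokens it needed before, which would translate directly into more usable context and faster generation for that language.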

Benchmark Performance

The AA-Omniscience Index improved by 53 points over the prior version. Key scores: IFEval 91.84, MATH500 88.76, AIME25 42.53. The model outperforms similarly sized dense alternatives and competes with Gemma-4-26B despite being roughly a third of its size.

Inference Speed

On CPU, the model reaches 253 tokens per second on an M5 Max and approximately 30 tokens per second on mobile devices, using under 6GB of memory. On GPU, a single NVIDIA H100 achieves 18,500 output tokens per second at high concurrency, which works out to about 1.6 billion tokens per day.
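The daily figure follows from the per-second rate. The quick check below assumes the H100 sustains 18,500 tokens per second for a full 24 hours. It also estimates how long a reply takes at the quoted CPU and mobile speeds, and what the 6GB budget implies per weight. The 500-token reply length is a hypothetical value, and the bits-per-parameter bound treats the model as exactly 8B parameters with all memory spent on weights (no KV cache or runtime overhead).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def tokens_per_day(tokens_per_second):
    """Total tokens generated in one day at a sustained rate."""
    return tokens_per_second * SECONDS_PER_DAY

def seconds_to_generate(num_tokens, tokens_per_second):
    """Wall-clock time to emit num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_second

def max_bits_per_parameter(memory_gb, params_billions):
    """Upper bound on average bits per weight if all memory held weights.

    Uses GB = 10**9 bytes, so the 10**9 factors cancel.
    """
    return memory_gb * 8 / params_billions

h100_daily = tokens_per_day(18_500)
reply_tokens = 500  # hypothetical reply length, not from the article

m5_seconds = seconds_to_generate(reply_tokens, 253)
mobile_seconds = seconds_to_generate(reply_tokens, 30)
bits = max_bits_per_parameter(6, 8)

print(f"H100 tokens per day: {h100_daily:,}")
print(f"{reply_tokens}-token reply on M5 Max: {m5_seconds:.1f} s")
print(f"{reply_tokens}-token reply on mobile: {mobile_seconds:.1f} s")
print(f"max average bits per parameter in 6GB: {bits:.1f}")
```

The product is 1,598,400,000 tokens, so roughly 1.6 billion per day. The 6-bit ceiling suggests the mobile figure relies on a quantized build, though the article does not state which format was used.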

Deployment and Ecosystem

The model ships with support for llama.cpp, MLX, vLLM, SGLang, and ONNX, covering Apple, AMD, Intel, Qualcomm, and NVIDIA hardware. The LocalCowork demo showcases 67 tools across 13 servers executing entirely on-device with sub-second dispatch latency and no cloud dependency.
