Liquid AI Releases LFM2.5: 8B MoE Model Trained on 38T Tokens

A New Edge AI Benchmark

Liquid AI has released LFM2.5 8B-A1B, a full-scale upgrade to its October 2025 predecessor. The model uses a Mixture-of-Experts (MoE) architecture optimized for on-device inference on edge hardware. Training data scaled from 12T to 38T tokens, more than triple the amount used for the previous version.

Key Technical Improvements

The context window expands 4x, from 32K to 128K tokens, and the vocabulary roughly doubles, from 65K to 128K. Multilingual tokenization saw major gains: Hindi +120.4%, Thai +238.2%, Vietnamese +117.9%. The model also introduces targeted probability redistribution to address reasoning-loop failures and knowledge-boundary optimization to mitigate hallucinations.
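The scaling figures quoted in this article are easy to sanity-check. The sketch below recomputes the ratios and, as an assumption the release does not confirm, reads each tokenization gain as "that much more text per token", which would imply correspondingly fewer tokens for the same text. The function names are illustrative only.

```python
def scale_factor(old: float, new: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


def relative_token_count(gain_percent: float) -> float:
    """Fraction of the old token count needed to encode the same text.

    ASSUMPTION: a gain of +X% means each token covers X% more text, so the
    token count shrinks by a factor of 1 / (1 + X / 100). The article does
    not define the metric; this is only one plausible reading.
    """
    return 1.0 / (1.0 + gain_percent / 100.0)


# Figures as quoted in the article.
training_tokens = scale_factor(12e12, 38e12)    # 12T -> 38T tokens
context_window = scale_factor(32_000, 128_000)  # 32K -> 128K tokens
vocabulary = scale_factor(65_000, 128_000)      # 65K -> 128K entries

gains = {"Hindi": 120.4, "Thai": 238.2, "Vietnamese": 117.9}

if __name__ == "__main__":
    print(f"training tokens: {training_tokens:.2f}x")
    print(f"context window:  {context_window:.2f}x")
    print(f"vocabulary:      {vocabulary:.2f}x")
    for language, gain in gains.items():
        share = relative_token_count(gain)
        print(f"{language}: about {share:.0%} of the previous token count")
```

Under that reading, the same Thai text would need roughly 30% of the tokens it needed before, and Hindi and Vietnamese a little under half. If the percentages measure something else, only `relative_token_count` needs to change.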

Benchmark Performance

The AA-Omniscience Index improved by 53 points over the prior version. Key scores: IFEval 91.84, MATH500 88.76, AIME25 42.53. The model outperforms similarly sized dense alternatives and competes with Gemma-4-26B despite being roughly a third of its size.

Inference Speed

On CPU, the model reaches 253 tokens per second on an M5 Max and approximately 30 tokens per second on mobile devices in under 6GB of memory. On GPU, a single NVIDIA H100 achieves 18,500 output tokens per second at high concurrency, which works out to roughly 1.6 billion tokens per day.
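The daily figure follows directly from the per-second rate, and the same arithmetic gives a feel for single-response latency on the slower targets. The sketch below is a back-of-the-envelope calculation using the rates quoted above; the 500-token response length is a hypothetical value chosen for illustration, and sustained 24-hour load is assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def tokens_per_day(tokens_per_second: float) -> float:
    """Tokens produced in 24 hours at a constant, sustained rate."""
    return tokens_per_second * SECONDS_PER_DAY


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate `num_tokens` at a constant decode rate."""
    return num_tokens / tokens_per_second


H100_RATE = 18_500    # aggregate output tokens/s at high concurrency (article)
M5_MAX_RATE = 253     # tokens/s (article)
MOBILE_RATE = 30      # approximate tokens/s (article)
RESPONSE_TOKENS = 500  # HYPOTHETICAL response length, not from the article

if __name__ == "__main__":
    daily = tokens_per_day(H100_RATE)
    print(f"H100: {daily:,.0f} tokens/day ({daily / 1e9:.2f} billion)")
    for name, rate in (("M5 Max", M5_MAX_RATE), ("mobile", MOBILE_RATE)):
        secs = seconds_to_generate(RESPONSE_TOKENS, rate)
        print(f"{name}: {secs:.1f} s for {RESPONSE_TOKENS} tokens")
```

The H100 number is aggregate throughput across many concurrent requests, so it is used here only for the daily total and not for per-response latency.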

Deployment and Ecosystem

The model ships with support for llama.cpp, MLX, vLLM, SGLang, and ONNX, covering Apple, AMD, Intel, Qualcomm, and NVIDIA hardware. The LocalCowork demo showcases 67 tools across 13 servers executing entirely on-device with sub-second dispatch latency and no cloud dependency.
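To make the idea of tools grouped under servers and dispatched locally more concrete, here is a minimal sketch of an in-process tool registry. It is not LocalCowork's implementation: the registry layout, the `register` and `dispatch` helpers, and the two example tools are all hypothetical, and the timing covers only the local function call, not model inference.

```python
import time
from typing import Callable, Dict, Tuple

ToolFn = Callable[..., object]

# HYPOTHETICAL registry: (server name, tool name) -> local callable.
REGISTRY: Dict[Tuple[str, str], ToolFn] = {}


def register(server: str, tool: str) -> Callable[[ToolFn], ToolFn]:
    """Decorator that files a function under a server and tool name."""
    def decorator(fn: ToolFn) -> ToolFn:
        REGISTRY[(server, tool)] = fn
        return fn
    return decorator


@register("files", "word_count")
def word_count(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())


@register("math", "add")
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


def dispatch(server: str, tool: str, **kwargs: object) -> Tuple[object, float]:
    """Run a registered tool in-process; return (result, elapsed seconds)."""
    fn = REGISTRY[(server, tool)]  # raises KeyError for unknown tools
    start = time.perf_counter()
    result = fn(**kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    result, elapsed = dispatch("math", "add", a=2, b=3)
    print(f"math.add -> {result} in {elapsed * 1000:.3f} ms")
```

Because every lookup and call stays in the same process, nothing leaves the device. A real system would add argument validation and a way for the model to choose the tool.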
