Z.ai Releases GLM-5: 744B Parameter Open-Source Powerhouse

Technical Specifications

GLM-5 is a substantial scale-up over its predecessor. Total parameters grow from 355B (32B active) to 744B (40B active), and pre-training data expands from 23T to 28.5T tokens. A notable architectural addition is DeepSeek Sparse Attention (DSA), which reportedly "reduces deployment cost while preserving long-context capacity."
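To put those figures in proportion, the short Python sketch below computes the share of parameters active per token and the growth ratios between the two models. It uses only the numbers quoted above; the dictionary keys and helper names are illustrative and not taken from any Z.ai tooling.

```python
# Figures quoted in the article: parameters in billions, tokens in trillions.
PREDECESSOR = {"total_b": 355, "active_b": 32, "tokens_t": 23.0}
GLM5 = {"total_b": 744, "active_b": 40, "tokens_t": 28.5}


def active_fraction(spec):
    """Share of total parameters that are active per token."""
    return spec["active_b"] / spec["total_b"]


def growth(new, old):
    """Multiplicative growth from old to new."""
    return new / old


if __name__ == "__main__":
    print(f"predecessor active share: {active_fraction(PREDECESSOR):.1%}")
    print(f"GLM-5 active share:       {active_fraction(GLM5):.1%}")
    print(f"total parameter growth:   {growth(GLM5['total_b'], PREDECESSOR['total_b']):.2f}x")
    print(f"active parameter growth:  {growth(GLM5['active_b'], PREDECESSOR['active_b']):.2f}x")
    print(f"training token growth:    {growth(GLM5['tokens_t'], PREDECESSOR['tokens_t']):.2f}x")
```

On these numbers, total parameters roughly double while active parameters grow by only a quarter, so the share of the model active per token falls from about 9% to about 5%.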

Performance Highlights

The reported results span several kinds of evaluation:

Academic Benchmarks: Described as achieving "best-in-class performance among all open-source models" in reasoning, coding, and agentic tasks.
Real-World Tasks: On CC-Bench-V2, GLM-5 significantly outperforms GLM-4.7 in frontend, backend, and long-horizon tasks.
Long-Horizon Planning: Ranks #1 among open-source models on Vending Bench 2, completing a simulated year-long business scenario with a final balance of $4,432.

What Makes It Significant

Beyond scaling, GLM-5 introduces slime, described as "an asynchronous RL infrastructure that substantially improves training throughput and efficiency." This addresses a critical challenge: deploying reinforcement learning at scale for large language models.
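The general idea behind asynchronous RL training can be illustrated with a small sketch: rollout workers generate episodes on their own schedule and push them into a queue, while a trainer consumes batches as they arrive instead of waiting for every worker to finish. The Python example below is a toy illustration of that pattern only. It is not slime's API or design, and every name in it (`rollout_worker`, `trainer`, `run`) is hypothetical.

```python
import asyncio


async def rollout_worker(worker_id, queue, n_episodes):
    """Hypothetical rollout worker: produces episodes and enqueues them."""
    for i in range(n_episodes):
        await asyncio.sleep(0)  # stand-in for slow model generation
        await queue.put({"worker": worker_id, "episode": i, "reward": float(i % 2)})


async def trainer(queue, batch_size, total_episodes):
    """Hypothetical trainer: consumes episodes in batches as they arrive."""
    steps, seen, batch = 0, 0, []
    while seen < total_episodes:
        batch.append(await queue.get())
        seen += 1
        if len(batch) == batch_size or seen == total_episodes:
            steps += 1  # stand-in for one gradient update on `batch`
            batch = []
    return steps, seen


async def run(n_workers=4, episodes_per_worker=8, batch_size=8):
    # A bounded queue keeps workers from running arbitrarily far ahead of the trainer.
    queue = asyncio.Queue(maxsize=16)
    total = n_workers * episodes_per_worker
    workers = [
        asyncio.create_task(rollout_worker(w, queue, episodes_per_worker))
        for w in range(n_workers)
    ]
    steps, seen = await trainer(queue, batch_size, total)
    await asyncio.gather(*workers)
    return steps, seen


if __name__ == "__main__":
    steps, seen = asyncio.run(run())
    print(f"{steps} training steps over {seen} episodes")
```

The bounded queue is one simple way to limit how stale the queued episodes can become relative to the trainer's current weights. A production system would also need to handle weight synchronization and off-policy correction, which this sketch leaves out.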

The model is purpose-built for "complex systems engineering and long-horizon agentic tasks," positioning it as a bridge between traditional language models and autonomous agent capabilities.

Background

The release drew significant attention on Reddit, scoring 730 points on r/LocalLLaMA and 289 points on r/singularity. Z.ai openly acknowledged GPU scarcity, stating "compute is very tight."
