Z.ai Releases GLM-5: 744B Parameter Open-Source Powerhouse
Original: GLM-5 Officially Released
Technical Specifications
GLM-5 is a significant scale-up of its predecessor, GLM-4.7: parameters grow from 355B (32B active) to 744B (40B active), and the pre-training corpus expands from 23T to 28.5T tokens. A notable architectural addition is DeepSeek Sparse Attention (DSA), which reportedly "reduces deployment cost while preserving long-context capacity."
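The announcement does not detail how DSA is integrated, but the general idea behind this family of techniques is top-k sparse attention: each query attends only to a small, selected subset of keys instead of the full context, so per-query cost stops growing with sequence length. The sketch below is a dense, illustrative toy, not GLM-5's actual design; the shapes, the `top_k` value, and the use of raw query-key scores for selection (the real DSA uses a separate learned indexer) are all assumptions.

```python
# Illustrative top-k sparse attention, a minimal sketch of the idea behind
# DeepSeek Sparse Attention (DSA). Assumption-laden toy: shapes, top_k, and
# the selection rule are not GLM-5's actual configuration.
import torch
import torch.nn.functional as F


def topk_sparse_attention(q, k, v, top_k=64):
    """Each query attends only to its top_k highest-scoring keys.

    q, k, v: tensors of shape (seq_len, d_head). This dense version still
    materializes the full score matrix for clarity; a real implementation
    gathers only the selected keys, which is where the savings come from.
    """
    d = q.shape[-1]
    scores = q @ k.T / d**0.5                        # (seq, seq) full scores
    # Causal mask: queries may only attend to earlier positions.
    causal = torch.tril(torch.ones_like(scores, dtype=torch.bool))
    scores = scores.masked_fill(~causal, float("-inf"))
    # Keep only the top_k entries per query row; mask out the rest.
    k_eff = min(top_k, scores.shape[-1])
    kth = scores.topk(k_eff, dim=-1).values[:, -1:]  # k-th largest per row
    sparse = scores.masked_fill(scores < kth, float("-inf"))
    return F.softmax(sparse, dim=-1) @ v


seq, d_head = 512, 64
q, k, v = (torch.randn(seq, d_head) for _ in range(3))
out = topk_sparse_attention(q, k, v, top_k=64)
print(out.shape)  # torch.Size([512, 64])
```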
Performance Highlights
The model posts strong results across several evaluation suites:
- Academic Benchmarks: Achieves "best-in-class performance among all open-source models" in reasoning, coding, and agentic tasks
- Real-World Tasks: On CC-Bench-V2, GLM-5 significantly outperforms GLM-4.7 in frontend, backend, and long-horizon tasks
- Long-Horizon Planning: Ranks #1 among open-source models on Vending Bench 2, completing a simulated year-long business scenario with a final balance of $4,432
What Makes It Significant
Beyond scaling, GLM-5 introduces slime, described as "an asynchronous RL infrastructure that substantially improves training throughput and efficiency." This addresses a critical bottleneck: applying reinforcement learning at scale to large language models.
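The announcement does not describe slime's design beyond "asynchronous," but the core pattern in asynchronous RL infrastructure is decoupling rollout generation from gradient updates so neither stage idles waiting for the other. The toy sketch below illustrates that producer/consumer decoupling with threads and a bounded queue; it is not slime's API (slime itself is open source), and every name, size, and the fake "reward" in it is an illustrative assumption.

```python
# Minimal sketch of asynchronous rollout/training decoupling, the pattern an
# async RL system implements. NOT slime's actual API; all values are toys.
import queue
import random
import threading

trajectory_queue = queue.Queue(maxsize=8)  # bounded buffer between stages


def rollout_worker(worker_id: int, steps: int) -> None:
    """Generate trajectories continuously, independent of the trainer."""
    for _ in range(steps):
        trajectory = {
            "worker": worker_id,
            "tokens": [random.randint(0, 100) for _ in range(16)],
            "reward": random.random(),  # stand-in for a real reward signal
        }
        trajectory_queue.put(trajectory)  # blocks only if the buffer is full


def trainer(total_updates: int, batch_size: int = 4) -> None:
    """Consume trajectories as they arrive; never wait for a synced batch."""
    for update in range(total_updates):
        batch = [trajectory_queue.get() for _ in range(batch_size)]
        mean_reward = sum(t["reward"] for t in batch) / len(batch)
        # A real trainer would compute a policy-gradient loss and step the
        # optimizer here; we just report progress.
        print(f"update {update}: mean reward {mean_reward:.3f}")


workers = [threading.Thread(target=rollout_worker, args=(i, 8)) for i in range(2)]
for w in workers:
    w.start()
trainer(total_updates=4)  # 4 updates x 4 trajectories = 16 produced above
for w in workers:
    w.join()
```

The payoff of this decoupling is throughput: in a synchronous loop, GPUs running gradient updates sit idle during generation and vice versa, and that idle time dominates as rollouts get longer.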
The model is purpose-built for "complex systems engineering and long-horizon agentic tasks," positioning it as a bridge between traditional language models and autonomous agent capabilities.
Background
The release drew significant attention, reaching 730 points on Reddit's r/LocalLLaMA and 289 points on r/singularity. Z.ai openly acknowledged GPU scarcity, stating "compute is very tight."