Z.ai Releases GLM-5: 744B Parameter Open-Source Powerhouse
Original: GLM-5 Officially Released
Technical Specifications
GLM-5 is a substantial scale-up of its predecessor: total parameters grow from 355B (32B active) to 744B (40B active), and the pre-training corpus expands from 23T to 28.5T tokens. A notable architectural addition is DeepSeek Sparse Attention (DSA), which reportedly "reduces deployment cost while preserving long-context capacity."
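The total-versus-active split is standard mixture-of-experts (MoE) arithmetic: a router activates only a few experts per token, so per-token compute tracks the active parameter count rather than the total. The sketch below illustrates the idea in PyTorch; the layer sizes, expert count, and top-2 routing are invented for illustration and are not GLM-5's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy MoE feed-forward layer: each token is routed to top_k of
    n_experts, so only a fraction of parameters is active per token.
    All sizes here are made up for illustration."""
    def __init__(self, d_model=64, d_ff=256, n_experts=16, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):  # x: (n_tokens, d_model)
        scores, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(scores, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):          # run only the chosen experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

layer = TopKMoE()
total = sum(p.numel() for p in layer.parameters())
per_expert = sum(p.numel() for p in layer.experts[0].parameters())
active = sum(p.numel() for p in layer.router.parameters()) + layer.top_k * per_expert
print(f"total: {total:,}  active per token: {active:,}  ({active / total:.0%})")
print(layer(torch.randn(5, 64)).shape)  # torch.Size([5, 64])
```

At this toy scale the layer holds roughly 530K parameters but touches only about 13% of them per token; GLM-5's 744B-total / 40B-active split reflects the same ratio logic at production scale.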
Performance Highlights
The model demonstrates strong capabilities across multiple evaluation frameworks:
- Academic Benchmarks: Achieves "best-in-class performance among all open-source models" in reasoning, coding, and agentic tasks
- Real-World Tasks: On CC-Bench-V2, GLM-5 significantly outperforms GLM-4.7 in frontend, backend, and long-horizon tasks
- Long-Horizon Planning: Ranks #1 among open-source models on Vending Bench 2, completing a simulated year-long business scenario with a final balance of $4,432
What Makes It Significant
Beyond scaling, GLM-5 introduces slime, described as "an asynchronous RL infrastructure that substantially improves training throughput and efficiency." This addresses a critical bottleneck: scaling reinforcement-learning training for large language models, where synchronous pipelines leave either the rollout engines or the trainer sitting idle.
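Z.ai's announcement doesn't detail slime's internals, but the asynchronous pattern it names generally decouples rollout generation from gradient updates. The toy below (all names, sleep times, and batch sizes invented) shows the shape of that decoupling with worker threads and a queue; it is a sketch of the general pattern, not slime's API.

```python
import queue
import random
import threading
import time

# Toy async-RL loop: rollout workers produce trajectories continuously
# while the trainer consumes whatever is ready, so neither side blocks
# on the other. Illustration of the general pattern only.

rollouts: queue.Queue = queue.Queue(maxsize=64)
stop = threading.Event()

def rollout_worker() -> None:
    """Stand-in for an inference engine generating trajectories."""
    while not stop.is_set():
        time.sleep(random.uniform(0.01, 0.05))             # fake generation latency
        rollouts.put([random.random() for _ in range(8)])  # fake per-step rewards

def trainer(num_steps: int, batch_size: int = 4) -> None:
    """Consume ready rollouts; the trainer never waits on one slow
    generation, which is the throughput win of asynchronous RL."""
    for step in range(num_steps):
        batch = [rollouts.get() for _ in range(batch_size)]
        mean_reward = sum(sum(t) / len(t) for t in batch) / len(batch)
        print(f"step {step}: {len(batch)} rollouts, mean reward {mean_reward:.3f}")

for _ in range(4):  # several generators feed one trainer
    threading.Thread(target=rollout_worker, daemon=True).start()
trainer(num_steps=5)
stop.set()
```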
The model is purpose-built for "complex systems engineering and long-horizon agentic tasks," positioning it as a bridge between traditional language models and autonomous agent capabilities.
Background
The release drew significant attention, reaching 730 points on r/LocalLLaMA and 289 on r/singularity. Z.ai openly acknowledged GPU scarcity, stating that "compute is very tight."
Related Articles
Andrej Karpathy has published autoresearch, a minimal repo that lets AI agents iterate on a stripped-down nanochat training loop overnight. The project turns agent evaluation into a closed-loop research workflow with fixed 5-minute runs, Git branches, and validation-loss-based selection.
China's GLM-5 model achieves a score of 50 on the Intelligence Index, claiming top performance among open-source large language models.
Meta has unveiled Llama 4 Scout and Maverick, the first open-weight natively multimodal models. With industry-leading 10 million token context and MoE architecture, they outperform GPT-4o and Gemini 2.0 Flash.