GLM 5.2 hits 64% on Vibe Code Bench as open weights close in

GLM 5.2 has crossed a notable line for open-weight coding models on a benchmark that tests building web applications from scratch. In a post on X, Vals AI wrote that “GLM 5.2 is the only open-weight model to break 60%” on Vibe Code Bench v1.1, with a reported score of 64%.

The number matters because the gap is not marginal. Vals AI said no other open-weight model on the board reaches 50%, which puts GLM 5.2 more than 14 percentage points ahead of the next open-weight entry. That makes the result less about a single leaderboard win and more about whether open models are becoming viable for real app-building workflows that previously leaned on closed frontier systems.

Vals AI describes itself as a public LLM evaluation group and typically posts benchmark comparisons rather than general product marketing. The tweet follows a broader wave of attention around Z.ai’s GLM 5.2, a model positioned around long-context coding and agentic engineering tasks. Vibe Code Bench is especially relevant because it focuses on the end-to-end ability to produce web applications, not only solve isolated programming questions.

The next thing to watch is repeatability. A 64% score is meaningful only if it holds across different prompts, app types, scaffolds, and evaluation settings. Developers will also care about serving cost, latency, tool compatibility, and whether the model’s advantage translates into fewer manual fixes. If the open-weight field follows GLM 5.2 past the 50% mark, the economics of coding agents could shift quickly.

Related Articles

Local LLM users want the missing 80-160B middle

GLM-5.2 pushes open weights into the cost-versus-reasoning debate

GLM-5 Becomes Top Open-Weights Model on Extended NYT Connections Benchmark

LLM Reddit Feb 24, 2026 1 min read