r/LocalLLaMA spotlights GigaChat 3.1 open weights from 10B to 702B
Original: New open weights models: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B
A high-engagement r/LocalLLaMA post announced two new open-weights releases under the MIT license: GigaChat-3.1-Ultra, a 702B-parameter mixture-of-experts model with roughly 36B active parameters per token, and GigaChat-3.1-Lightning, a 10B MoE with about 1.8B active parameters aimed at far smaller deployments. The post is notable because it does not present the release as a minor fine-tune: the team says both models were pretrained from scratch on its own data and hardware, with English and Russian as core optimization targets and 14 languages in the training mix.
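For readers less used to the "A-number" shorthand, a quick back-of-the-envelope sketch of what those figures imply. The parameter counts come from the post; the activation fraction below is simple arithmetic, not a memory or throughput estimate:

```python
# Active-vs-total parameter math for the two MoE releases (counts from the post).
models = {
    "GigaChat-3.1-Ultra":     {"total_b": 702, "active_b": 36},
    "GigaChat-3.1-Lightning": {"total_b": 10,  "active_b": 1.8},
}

for name, p in models.items():
    frac = p["active_b"] / p["total_b"]
    print(f"{name}: {p['active_b']}B of {p['total_b']}B params active per token ({frac:.1%})")
```

The point of the sparsity is that per-token compute scales with the active count, while memory footprint still scales with the total, which is why Ultra remains a multi-node proposition even though only a fraction of it fires per token.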
The smaller Lightning model is the more immediately practical story for the local-model community. The authors claim a 256k context window, strong tool-calling behavior, and FP8 plus multi-token-prediction support that they say keeps throughput high on a single-H100 benchmark setup. They report a BFCL v3 score of 0.76 for tool use and compare Lightning against Qwen3, SmolLM3, Gemma 3, and YandexGPT Lite models. The larger Ultra release targets multi-node environments, with the post saying it can run on three HGX instances and outperform several open-weight comparators on the team's internal benchmark table.
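If the tool-calling claim holds up, the most common way the community will probe it is through an OpenAI-compatible server such as vLLM. The sketch below shows that pattern; the model id, endpoint, and the `get_weather` tool are placeholders for illustration, not details confirmed by the post:

```python
# Hedged sketch: exercising Lightning's claimed tool calling via an
# OpenAI-compatible endpoint (e.g. a local vLLM server). Model id, port,
# and the example tool are assumptions, not from the release notes.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="GigaChat-3.1-Lightning",  # placeholder model id
    messages=[{"role": "user", "content": "What's the weather in Kazan right now?"}],
    tools=tools,
)

# If the chat template wires tools correctly, the call should land here
# rather than in free-form text.
print(resp.choices[0].message.tool_calls)
```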
What makes this interesting beyond the headline numbers is the packaging. The release includes weights and GGUF variants on Hugging Face, and the team links a longer technical report on Habr. That gives the community something more useful than a teaser: people can inspect licensing, evaluate deployment fit, and decide whether the multilingual and CIS-focused angle fills a gap that US- and China-centered open model ecosystems often leave open.
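Because GGUF variants are part of the release, a purely local test drive is also straightforward. A minimal sketch with llama-cpp-python follows; the repo id, file name, and quantization level are placeholders, so check the actual Hugging Face model card before running it:

```python
# Hedged sketch: download a GGUF build of Lightning and run it locally.
# Repo id and file name are assumptions; consult the model card for the
# published GGUF names and quant levels.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="GigaChat-3.1-Lightning-GGUF",          # placeholder repo id
    filename="gigachat-3.1-lightning-q4_k_m.gguf",  # placeholder file name
)

# Use a context far below the claimed 256k window to keep RAM needs modest.
llm = Llama(model_path=gguf_path, n_ctx=32768)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the GigaChat 3.1 release in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```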
The usual caveat applies. These benchmark tables are vendor-reported claims, not independent reproductions, so the real test will be community evaluations on coding, reasoning, latency, and quantized inference. Even so, r/LocalLLaMA treated the announcement as a meaningful addition to the open-weights landscape because it spans both frontier-scale and genuinely deployable sizes.
Why the post stood out
- It ships both a very large 702B MoE and a more local-friendly 10B MoE with 1.8B active parameters.
- The models are released under MIT terms with Hugging Face weights and GGUFs.
- The team claims training from scratch rather than a simple downstream fine-tune.
- Multilingual support and Russian/CIS optimization give the release a distinct regional angle.
Related Articles
LocalLLaMA surfaced an MIT-licensed GigaChat 3.1 release that pairs a 702B MoE model for clusters with a 10B MoE model aimed at faster deployment and lighter inference.
NVIDIA's new Nemotron 3 Super pairs a 120B total / 12B active hybrid Mamba-Transformer MoE with a native 1M-token context window and open weights, datasets, and recipes. LocalLLaMA discussion centered on whether those openness and efficiency claims translate into realistic home-lab deployments.
A new r/LocalLLaMA thread argues that NVIDIA's Nemotron-Cascade-2-30B-A3B deserves more attention after quick local coding evals came in stronger than expected. The post is interesting because it lines up community measurements with NVIDIA's own push for a reasoning-oriented open MoE model that keeps activated parameters low.