LocalLLaMA Revisits RYS on Qwen3.5 and the Case for a Shared Reasoning Space

Original: RYS II - Repeated layers with Qwen3.5 27B and some hints at a 'Universal Language'

LLM · Mar 27, 2026 · By Insights AI (Reddit) · 2 min read

r/LocalLLaMA amplified David Noel Ng's LLM Neuroanatomy II because it combines model hacking with a specific empirical claim: repeating blocks in the middle of a transformer still seems to help on modern open models. The post revisits RYS, or Repeat Your Self, on Qwen3.5-27B and argues that relayering was not just a one-off trick on older Qwen2 checkpoints.
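The post does not spell out which blocks get repeated, so the following is only an illustrative sketch of what "relayering" means mechanically: the model's execution order re-reads a span of middle layers a second time, without training any new weights. The 12-layer size and the 4–8 repeat span are made-up numbers, not Ng's configuration.

```python
# Toy sketch of RYS-style layer repetition. Assumption: the exact repeat
# pattern is not given in the post; this pattern is purely illustrative.
# A decoder is modeled as an ordered list of layer indices; "relayering"
# inserts a middle span of layers a second time, right after itself.

def rys_layer_order(n_layers, repeat_start, repeat_end):
    """Return the execution order: all layers once, with the span
    [repeat_start, repeat_end) executed again immediately after itself."""
    base = list(range(n_layers))
    repeated = base[repeat_start:repeat_end]
    return base[:repeat_end] + repeated + base[repeat_end:]

# A hypothetical 12-layer model with layers 4..7 executed twice:
order = rys_layer_order(12, 4, 8)
print(order)  # [0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7, 8, 9, 10, 11]
```

The search the post describes (beam search over candidates, a surrogate scorer) would then be a search over which spans to repeat and how often, rather than over the weights themselves.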

The blog says Ng explored the space with 3,024 beam search candidates, a surrogate model that scored 2 million configurations, and a unified validation sweep before releasing new RYS variants. That matters because the community has seen plenty of merge culture and Frankenstein models before. What stood out here was the attempt to turn layer duplication into a more systematic search problem rather than a lucky recipe.

A stronger claim than just bigger models

The post also makes a more interesting representational claim. Ng shows multilingual hidden-state comparisons where, in the middle of the network, cross-language pairs with the same content stay more similar than same-language pairs with different content. In the blog's framing, that suggests a format-agnostic reasoning space and hints at a universal language inside the model. That does not prove that LLMs literally think in one language, but it does offer a concrete measurement that readers can argue about instead of a vague metaphor.
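The blog does not state Ng's exact metric, but cosine similarity over mid-layer hidden states is a common choice for this kind of comparison, and a toy version makes the measurement concrete. The vectors below are fabricated for illustration: each "hidden state" is a content component plus a smaller language-specific component, so the cross-language/same-content pair scores higher by construction. Real hidden states would come from the model, and the interesting question is whether they behave the same way.

```python
# Sketch of the measurement protocol, not of Ng's actual pipeline.
# Assumption: cosine similarity over mid-layer states; the toy vectors
# below are invented to illustrate the comparison, not real activations.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical components: hidden state = content + language offset.
content_a = [1.0, 0.2, 0.7]
content_b = [0.1, 0.9, -0.4]
lang_en   = [0.2, 0.0, 0.1]
lang_ko   = [0.0, 0.2, -0.1]

h_en_a = [c + l for c, l in zip(content_a, lang_en)]  # English, content A
h_ko_a = [c + l for c, l in zip(content_a, lang_ko)]  # Korean, content A
h_en_b = [c + l for c, l in zip(content_b, lang_en)]  # English, content B

# If content dominates mid-network, the cross-language/same-content pair
# should score higher than the same-language/different-content pair:
print(cosine(h_en_a, h_ko_a))  # same content, different language
print(cosine(h_en_a, h_en_b))  # same language, different content
```

The claim in the post is essentially that real mid-layer states look like this decomposition: the content term dominates, so translation pairs cluster tighter than same-language pairs about different things.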

The Reddit summary tied those observations to a practical release: several RYS-Qwen3.5-27B-FP8 variants on Hugging Face, plus the claim that fine-tuning repeated-layer variants could push the size class further. It also noted an unresolved systems issue: repeating layers currently increases the memory footprint, and Ng says he is looking at formats where duplicated layers can stay as copies without extra VRAM apart from the KV cache.
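No such format is named in the post, but the in-memory version of the idea is simple aliasing: if a repeated layer is a reference to the same weight buffer rather than a copy, the weights are stored once no matter how many times the layer executes. The toy classes below are a hypothetical sketch of that accounting, not any real serialization format.

```python
# Sketch of deduplicating repeated layers via aliasing. Assumption: the
# post names no concrete format; this only illustrates the accounting.

class Layer:
    def __init__(self, n_params):
        self.weights = [0.0] * n_params  # stands in for a weight tensor

def unique_param_count(layers):
    """Count parameters once per distinct weight buffer, so aliased
    repeats contribute no extra storage."""
    seen = set()
    total = 0
    for layer in layers:
        if id(layer) not in seen:
            seen.add(id(layer))
            total += len(layer.weights)
    return total

base = [Layer(1000) for _ in range(4)]
# Aliasing reuses the middle layer objects instead of copying them:
aliased = base[:3] + [base[1], base[2]] + base[3:]

print(len(aliased))                 # 6 layers executed per forward pass
print(unique_param_count(aliased))  # 4000: repeats share their weights
```

On disk and in VRAM the same trick would store each weight tensor once and record the execution order separately; activations and KV cache would still grow with the number of executed layers, which matches the post's "apart from the KV cache" caveat.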

Comments reflected both enthusiasm and caution. Readers praised the rigor of the search and the hidden-state analysis, asked for more languages and more model families, and drew comparisons to earlier layer-merge experiments from the Llama 2 era. That mix captures why the thread mattered. RYS II is not just another "model feels smarter" claim. It is an attempt to connect architecture edits, multilingual representation geometry, and practical open-weight releases in a way that the open-model community can reproduce, challenge, and extend.


© 2026 Insights. All rights reserved.