Hacker News examines Percepta's claim that transformers can execute programs internally
Original: Executing programs inside transformers with exponentially faster inference
One of the more provocative AI links on Hacker News was Percepta's March 11, 2026 post Can LLMs Be Computers? The public page makes a bold claim in just a few lines: the team says it built a computer inside a transformer that can execute arbitrary C programs for millions of steps, with exponentially faster inference via 2D attention heads. Even in teaser form, that is enough to trigger a familiar HN reaction: intense curiosity followed immediately by demands for harder evidence.
The claim matters because it targets a boundary that current LLM systems still treat as external. Most modern agent systems generate code or tool calls and then wait for another runtime to execute them. Percepta is framing its work differently. According to the post description, execution itself is carried out inside the transformer rather than delegated outside it. That is a much stronger statement than ordinary tool use, because it suggests a model architecture can become a computational substrate instead of only a planner wrapped around other software.
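To make the distinction concrete, here is a minimal sketch of the conventional "plan, then delegate" loop that Percepta's claim would bypass. Everything here is illustrative, not Percepta's API: a real agent would call an LLM to produce the generated code, and the hypothetical function names are ours.

```python
import subprocess
import sys

def llm_generate_code(task: str) -> str:
    # Stand-in for a model call: in a real agent, a transformer emits
    # source code as text. Hard-coded here for illustration only.
    if task == "sum 1..100":
        return "print(sum(range(1, 101)))"
    raise ValueError("unknown task")

def external_runtime_execute(code: str) -> str:
    # The step under discussion: execution happens OUTSIDE the model,
    # in a separate interpreter process. Percepta's post claims this
    # step can instead happen inside the transformer itself.
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    code = llm_generate_code("sum 1..100")
    print(external_runtime_execute(code))  # → 5050
```

In this conventional loop the model is only a planner; the interpreter does the computing. Percepta's claim, as described, is that the second function collapses into the model's own forward pass.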
HN readers quickly connected the idea to two long-running research questions. The first is interpretability: if some behavior can be represented in a more program-like or pseudo-symbolic form, it may become easier to inspect than opaque end-to-end heuristics. The second is reasoning efficiency: several commenters read the post as evidence that next-token systems may be able to perform structured computation much more directly than today's tool-augmented stacks suggest. A few even speculated about combining this sort of mechanism with reinforcement learning or stronger planning loops.
But the enthusiasm came with obvious skepticism. Multiple readers said the write-up felt more like a teaser than a full explanation and asked for concrete benchmarks, practical examples, and a cleaner explanation of what the speedup actually measures. Others said the idea sounded brilliant but hard to evaluate from the public material alone. That criticism is fair. When a research claim is this ambitious, clarity and measurement matter as much as novelty.
So the HN thread is less a verdict than a marker. Percepta has put a high-upside research direction on the table: maybe transformers are not only sequence predictors, but can also serve as efficient internal executors for certain classes of computation. Whether that becomes a serious architectural shift will depend on the next step, which is not a sharper slogan but reproducible tasks, clearer exposition, and benchmarks the wider research community can test. Original source: Percepta. Community discussion: Hacker News.
Related Articles
A fast-rising LocalLLaMA post resurfaced David Noel Ng's write-up on duplicating a seven-layer block inside Qwen2-72B, a no-training architecture tweak that reportedly lifted multiple Open LLM Leaderboard benchmarks.
NVIDIA AI Developer introduced Nemotron 3 Super on March 11, 2026 as an open 120B-parameter hybrid MoE model with 12B active parameters and a native 1M-token context window. NVIDIA says the model targets agentic workloads with up to 5x higher throughput than the previous Nemotron Super model.