#transformer

AI Hacker News Mar 29, 2026 2 min read

Hacker News revives ATTN/11, a Transformer trained in PDP-11 assembly

Hacker News surfaced ATTN/11, a project that trains a single-layer, single-head Transformer in PDP-11 assembly on a PDP-11/34A. The README says careful fixed-point math, per-layer learning rates, and a 32KB memory budget cut training from multi-hour estimates to a 5.5-minute run that reaches 10/10 accuracy on digit reversal.

#retrocomputing #transformer #pdp-11

LLM Reddit Mar 3, 2026 1 min read

Tiny Transformers with Under 100 Parameters Achieve 100% Accuracy on 10-Digit Addition

Researchers have demonstrated that transformer models with fewer than 100 parameters can add two 10-digit numbers with 100% accuracy using digit tokenization, challenging assumptions about the minimum complexity needed for arithmetic reasoning.

#transformer #machine-learning #research

LLM Hacker News Mar 2, 2026 1 min read

MicroGPT Explained Interactively

growingSWE has created an interactive walkthrough of Andrej Karpathy's 200-line pure Python GPT implementation, letting you tokenize names, watch softmax convert scores to probabilities, step through backpropagation, and explore attention heatmaps.

#gpt #transformer #neural-network

LLM Hacker News Mar 1, 2026 2 min read

HN Spotlight: Karpathy's microgpt distills GPT training and inference into ~200 lines

A Hacker News thread with a score of 732 and 120 comments highlighted microgpt, Andrej Karpathy's single-file educational implementation of a GPT-style model. The project packages dataset handling, tokenization, autograd, Transformer layers, Adam optimization, and sampling into one compact Python script.

#llm-education #python #transformer

AI Feb 16, 2026 1 min read

Google DeepMind Introduces D4RT, Unifying 4D Scene Reconstruction and Tracking from 2D Video

Google DeepMind introduced D4RT, a single-model framework for dynamic 4D scene reconstruction and tracking. The company reports up to 300x efficiency gains versus prior methods, highlighting real-time potential for robotics and AR workloads.

#computer-vision #robotics #transformer
