Karpathy open-sources autoresearch for autonomous single-GPU nanochat experiments


LLM · Mar 9, 2026 · By Insights AI · 2 min read

What Karpathy published

On March 7, Andrej Karpathy said he had packaged his recent autoresearch work into a self-contained repository that others could try over a weekend. The tweet describes a stripped-down single-GPU version of the nanochat training core where the human edits Markdown instructions and the AI agent edits the Python training code. Instead of prompting for one-off answers, the project turns an agent into a loop that proposes code changes, runs training, measures the outcome, and keeps iterating.
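The propose-run-measure-iterate loop described above can be sketched in a few lines. This is a toy stand-in, not the repo's actual code: the function names, the `lr` knob, and the synthetic loss surface are all illustrative, with the agent and the GPU training run replaced by trivial stubs.

```python
# Minimal sketch of an agent loop that proposes a change, runs a budgeted
# experiment, measures validation loss, and keeps only improvements.
# All names here are illustrative stand-ins, not the repo's API.

import random


def propose_change(params):
    """Stand-in for the agent editing the training code: perturb one knob."""
    candidate = dict(params)
    candidate["lr"] = params["lr"] * random.choice([0.5, 1.0, 2.0])
    return candidate


def run_training(params):
    """Stand-in for a fixed-budget training run returning a validation loss."""
    # Toy objective: loss is minimized at lr == 0.1.
    return (params["lr"] - 0.1) ** 2 + 1.0


def autoresearch_loop(n_experiments=20, seed=0):
    random.seed(seed)
    best = {"lr": 1.0}
    best_loss = run_training(best)
    for _ in range(n_experiments):
        candidate = propose_change(best)       # agent proposes a code change
        loss = run_training(candidate)         # budgeted training run
        if loss < best_loss:                   # select on validation loss
            best, best_loss = candidate, loss  # "commit" the improvement
    return best, best_loss
```

Each iteration here corresponds to one budgeted experiment; improvements are kept and regressions discarded, the same selection pressure the repo applies by accumulating commits only when validation loss drops.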

How the repo works

The GitHub page describes autoresearch as AI agents running research on single-GPU nanochat training automatically. Each experiment is budgeted at exactly five minutes, which gives roughly 12 runs per hour and about 100 while a user sleeps. The agent works on a Git feature branch, accumulates commits, and selects for lower validation loss rather than for subjective impressions. Karpathy’s framing is that the human should stop hand-editing the training loop and instead program the research organization itself through files such as program.md.
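The throughput numbers in the paragraph above fall out of simple arithmetic; the eight-hour night is an assumption used to recover the "about 100" figure, and the calculation ignores any overhead between runs.

```python
# Back-of-the-envelope budget for fixed five-minute experiments
# run back-to-back with no gaps between them.
minutes_per_run = 5
runs_per_hour = 60 // minutes_per_run           # 12 runs per hour
hours_asleep = 8                                # assumed night's sleep
runs_overnight = runs_per_hour * hours_asleep   # 96, i.e. "about 100"
```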

The repository is intentionally minimal. Karpathy says the training core is compressed into about 630 lines for a single-GPU setup, which makes the loop easier for an agent to inspect and modify. The README also notes that the current version expects one NVIDIA GPU, while forks can extend the idea to other platforms. That scope choice matters because it keeps the benchmark small enough to iterate quickly but still real enough to test whether an agent can improve a nontrivial training system.

Why this matters

The broader significance is not the specific nanochat baseline. It is the attempt to make autonomous research measurable, cheap, and repeatable. Fixed five-minute runs, Git-native versioning, and validation-loss-based selection create a cleaner testbed for comparing prompts, agents, and coordination strategies. If projects like this mature, the relevant question for research teams shifts from "can an agent write code?" to "can an agent run a disciplined experimental program that compounds over time?"

Sources: Karpathy X post, GitHub


© 2026 Insights. All rights reserved.