Demis Hassabis Proposes Definitive AGI Test: Could AI Discover General Relativity?
The Einstein Test: A Concrete Benchmark for AGI
Google DeepMind CEO Demis Hassabis has proposed a specific test for determining whether true AGI has been achieved, sparking discussion across the AI community.
In a YouTube interview, Hassabis described his vision: "The kind of test I would be looking for is training an AI system with a knowledge cutoff of, say, 1911, and then seeing if it could come up with general relativity, like Einstein did in 1915. That's the kind of test I think is a true test of whether we have a full AGI system."
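The "knowledge cutoff" in this proposal amounts to a data-curation step: the system may train only on material available up to the cutoff, and later work is held back as the target to rediscover. A minimal Python sketch of that split follows. The `Document` type, the `split_by_cutoff` helper, the three-item corpus, and the choice to treat 1911 itself as inside the cutoff are all illustrative assumptions; Hassabis did not describe an implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical record for one item in a dated corpus."""
    title: str
    year: int  # publication year


def split_by_cutoff(docs, cutoff_year=1911):
    """Split docs into (training, held_out) around a knowledge cutoff.

    Assumption: the cutoff year itself counts as known to the system.
    """
    training = [d for d in docs if d.year <= cutoff_year]
    held_out = [d for d in docs if d.year > cutoff_year]
    return training, held_out


# Tiny illustrative corpus; a real one would contain millions of dated texts.
corpus = [
    Document("On the Electrodynamics of Moving Bodies", 1905),
    Document("Space and Time (Minkowski lecture)", 1908),
    Document("The Field Equations of Gravitation", 1915),
]

training, held_out = split_by_cutoff(corpus)
print("Visible to the system:", [d.title for d in training])
print("Held out as the target:", [d.title for d in held_out])
```

In practice the hard part is not the filter but reliable dating: later reprints, commentaries, and textbooks that restate post-cutoff ideas would all have to be excluded for the test to be meaningful.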
Why This Test Matters
The power of this proposal lies in what it measures: not memorization or pattern recognition, but genuine scientific reasoning and creative discovery. General relativity required Einstein to synthesize existing mathematical tools and physical observations into an entirely new conceptual framework — something that goes far beyond recombining known information.
- Physics available by 1911: Newtonian mechanics, special relativity (1905), electromagnetism
- Einstein's 1915 achievement: Describing gravity as the curvature of spacetime, building on the equivalence principle
- Required capability: Paradigm-breaking conceptual innovation
The Gap Between Current LLMs and AGI
While today's large language models excel at synthesizing and explaining existing concepts, their ability to independently construct fundamentally new physical theories remains unproven. Hassabis's test crystallizes this distinction sharply.
The quote earned over 2,800 upvotes on r/singularity, prompting deeper discussion about what the ultimate goal of AI research really is and how far current systems remain from achieving it.
Competing Definitions of AGI
Hassabis's proposal also highlights the diversity of AGI definitions. While OpenAI's charter defines AGI as "highly autonomous systems that outperform humans at most economically valuable work," Hassabis sets a different and arguably more demanding bar: the ability to make genuine scientific discoveries. This distinction matters for how progress in AI development is measured and evaluated.