Demis Hassabis: "The True AGI Test Is an AI Deriving General Relativity on Its Own"


AI · Feb 22, 2026 · By Insights AI (Reddit)

Hassabis's AGI Test

Google DeepMind CEO Demis Hassabis has proposed a striking benchmark for determining whether true artificial general intelligence has been achieved. His remarks drew over 1,800 upvotes on Reddit's r/singularity community.

The Einstein Test

Hassabis described his idea as follows:

"The kind of test I would be looking for is training an AI system with a knowledge cutoff of, say, 1911, and then seeing if it could come up with general relativity, like Einstein did in 1915. That's the kind of test I think is a true test of whether we have a full AGI system."

Why This Test Is Meaningful

This benchmark goes well beyond pattern recognition or memorization. Einstein did not derive general relativity by analyzing data; he synthesized disparate physical principles in a fundamentally new way, driven by deep intuition and novel reasoning.

By Hassabis's standard, today's LLMs would fail this test. Current AI systems excel at synthesizing and summarizing existing knowledge but have not demonstrated the ability to discover genuinely new physical principles from first principles.

Implications for AGI Research

The statement offers a window into how Hassabis thinks about the goal of AGI at DeepMind. Rather than measuring task performance, he envisions AGI as a system capable of expanding the boundaries of human knowledge — not just operating within them.

This framing places the bar significantly higher than many current AGI benchmarks, which often focus on whether AI can perform human-level tasks rather than whether it can make Einstein-level scientific discoveries.




© 2026 Insights. All rights reserved.