ARC-AGI-3 Benchmarks: GPT-5.5 at 0.43%, Claude Opus 4.7 at 0.18%
Original: ARC-AGI-3 Update (GPT-5.5 High and Opus 4.7)
The Numbers
A post on r/singularity (354 points) reports the latest ARC-AGI-3 results: GPT-5.5 High scores 0.43% and Claude Opus 4.7 scores 0.18%.
What Is ARC-AGI-3?
ARC-AGI-3 is the third benchmark in the ARC Prize series and is significantly harder than ARC-AGI-2. It tests the kind of reasoning that humans perform easily but that current AI models struggle with.
Why It Matters
The most capable models ever built score functionally at zero on a test that any person would pass. ARC-AGI-3 remains one of the clearest indicators of the gap between today's AI and genuine general intelligence.