ARC-AGI-3 leaderboard: near-zero scores across frontier models

April 4, 2026 · ℏεsam

ℏεsam says ARC-AGI-3 is extremely hard, with GPT-5.4 High, Gemini 3.1 Pro Preview, and Anthropic Opus 4.6 Max all scoring around 0.2 to 0.3 percent and Grok 4.2 at 0.00 percent.

QUOTES

ARC-AGI-3 IS BRUTAL.

10 days after the release and Grok 4.2 has scored a glorious 0.00%.

🥇 GPT-5.4 (High): 0.3%

🥈 Gemini 3.1 Pro (Preview): 0.2%

🥉 Anthropic Opus 4.6 (Max): 0.2%

VOICES

ℏεsam

RELATED TERMS

benchmarks, evaluation, anthropic, gemini, gpt, opus, grok, arc agi
