Anthropic's Claude Opus 4.7 Ties for Top Spot on Artificial Analysis Intelligence Index with Score of 57


Anthropic's newly released AI model, Claude Opus 4.7, has scored 57 on the Artificial Analysis Intelligence Index. This places it in a three-way tie for first place among 342 models evaluated by the independent benchmarking organization, sharing the top rank with OpenAI's GPT-5.4 and Google's Gemini 3.1 Pro. Artificial Analysis announced the benchmark results in a recent tweet, inviting users to "Compare Opus 4.7 with other leading models at: https://t.co/abqe2OiXZB."

The benchmark results highlight Claude Opus 4.7's strong performance in real-world agentic knowledge work. It leads the GDPval-AA benchmark, Artificial Analysis's primary metric for general agentic performance across 44 occupations and nine major industries, with an Elo score of 1753. That puts Opus 4.7 a full 79 Elo points ahead of its closest competitors, Claude Sonnet 4.6 and GPT-5.4, which both scored 1674.
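To give a rough sense of what a 79-point Elo gap means, the sketch below converts the two reported ratings into an expected head-to-head win rate. It assumes the conventional logistic Elo formula with a 400-point scale; the article does not say how Artificial Analysis computes GDPval-AA ratings, so the function name, the scale, and the resulting figure are illustrative rather than official.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo logistic model.

    The 400-point scale is the conventional chess value and is an assumption
    here, not something the benchmark's publisher has confirmed.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings reported in the article.
opus_47_rating = 1753
runner_up_rating = 1674  # Claude Sonnet 4.6 and GPT-5.4

rating_gap = opus_47_rating - runner_up_rating
expected_win_rate = elo_expected_score(opus_47_rating, runner_up_rating)

print(f"Rating gap: {rating_gap} points")
print(f"Expected win rate for the higher-rated model: {expected_win_rate:.1%}")
```

Under that assumption, the gap works out to the higher-rated model being preferred in roughly 61% of pairwise comparisons.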

Beyond intelligence, the model demonstrates significant improvements in efficiency and reliability. Claude Opus 4.7 has secured the second spot on the Artificial Analysis Omniscience Index, primarily due to a substantial reduction in its hallucination rate, which dropped from 61% in Opus 4.6 to 36%. This improvement is attributed to the model's increased tendency to abstain from answering questions it does not confidently know.

Furthermore, the latest iteration of the Claude Opus series proves more cost-effective. Running the Artificial Analysis Intelligence Index with Opus 4.7 cost approximately $4,406, an 11% reduction compared to its predecessor, Opus 4.6, despite delivering superior performance. This efficiency gain is largely driven by lower output token usage, even with an updated tokenizer.
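The percentage figures reported for hallucination rate and benchmark cost can be unpacked with some simple arithmetic. The sketch below is a back-of-the-envelope check, not published data: the Opus 4.6 cost is not given in the article and is only implied here from the rounded 11% figure, and the helper name is hypothetical.

```python
def relative_change(old: float, new: float) -> float:
    """Fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


# Hallucination rates reported in the article, in percent.
opus_46_hallucination = 61.0
opus_47_hallucination = 36.0
point_drop = opus_46_hallucination - opus_47_hallucination
hallucination_change = relative_change(opus_46_hallucination, opus_47_hallucination)

# Reported cost to run the index with Opus 4.7 and the reported reduction.
opus_47_cost = 4406.0
reported_reduction = 0.11  # rounded in the article, so the result is approximate

# Assumption: the 11% is measured relative to the Opus 4.6 cost.
implied_opus_46_cost = opus_47_cost / (1.0 - reported_reduction)

print(f"Hallucination rate: down {point_drop:.0f} points ({hallucination_change:.0%} relative)")
print(f"Implied Opus 4.6 benchmark cost: about ${implied_opus_46_cost:,.0f}")
```

Read this way, the hallucination rate fell by 25 percentage points, or about 41% in relative terms, and the predecessor's run would have cost a little under $5,000.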

Claude Opus 4.7, released on April 16, 2026, also introduces several technical enhancements. These include improved capabilities in advanced software engineering, higher-resolution image processing, a new 'xhigh' effort level for finer control over reasoning, and the implementation of task budgets for managing token consumption in agentic workflows. The model is generally available, though Anthropic's more broadly capable Claude Mythos Preview remains in limited release.