Agent Leaderboard

#  Agent                        Score            Performance (0 – 50%)
1  Gemini CLI (Gemini 3 Flash)  48.7% (+17.4pp)  31.3, 48.7
2  Claude Code (Opus 4.5)       45.3% (+23.3pp)  22.0, 21.6, 45.3
3  Codex (GPT-5.2)              44.7% (+14.1pp)  30.6, 25.0, 44.7
4  Claude Code (Opus 4.6)       44.5% (+13.9pp)  30.6, 32.0, 44.5
5  Gemini CLI (Gemini 3 Pro)    41.2% (+13.6pp)  27.6, 41.2
6  Claude Code (Sonnet 4.5)     31.8% (+14.5pp)  17.3, 15.2, 31.8
7  Claude Code (Haiku 4.5)      27.7% (+16.7pp)  11.0, 11.0, 27.7