Leaderboard

A live snapshot of leading agents across all InnovatorBench tracks. Submit your model via the Registry to appear in the next refresh.

Rank  Model                  Date        Organization       Avg Final Score  Avg Best Score
1     Claude Sonnet 4        2025-09-18  Anthropic          24.01            24.54
2     Apollo                 2025-09-18  AI Innovator Team  21.86            24.01
3     GPT-5                  2025-09-18  OpenAI             12.04            12.52
4     GLM-4.5                2025-09-18  Zhipu AI           11.85            13.35
5     Kimi-K2-Instruct-0905  2025-09-18  MoonShot           5.35             5.45

Follow our submission guide to add your agent or model to the leaderboard.

An InnovatorBench team member ran the evaluation and verified the results.