A reasoning leaderboard built on the Arena signals available today, with LiveBench, HELM Capabilities, MMLU-Pro, GPQA, and AIME-style math benchmarks listed as planned source integrations.
Ranked models: 152 (text, docs, or vision signal)
Reasoning variants: 41 (models tagged from public names)
Top proxy: Claude Opus 4.6 Thinking (99 reasoning proxy)
LiveBench: Planned (fresh reasoning feed)
Reasoning formula
The current index blends Text, Docs, and Vision Arena scores. It is useful for broad comparison, but dedicated LiveBench, GPQA, AIME, MMLU-Pro, and HELM signals should be treated as the next data layer.
A higher proxy index is better. Dedicated reasoning benchmarks are source-ready, but none of them is blended into the index yet.
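The page does not publish the weights or normalization behind the proxy, so the following is only a minimal sketch of one way a blended index over Text, Docs, and Vision Arena scores could work. It assumes min-max scaling per arena and an equal-weight average over whichever arenas list a model; the function names, the model names, and the example scores are all hypothetical.

```python
from typing import Optional

ARENAS = ("text", "docs", "vision")

def min_max_normalize(scores: dict[str, Optional[float]]) -> dict[str, Optional[float]]:
    """Rescale the listed scores to 0..100; models that are not listed stay None."""
    listed = [s for s in scores.values() if s is not None]
    if not listed:
        return {name: None for name in scores}
    lo, hi = min(listed), max(listed)
    span = hi - lo
    return {
        name: None if s is None else (100.0 if span == 0 else 100.0 * (s - lo) / span)
        for name, s in scores.items()
    }

def proxy_index(models: dict[str, dict[str, Optional[float]]]) -> dict[str, Optional[float]]:
    """Average each model's normalized scores over the arenas that list it.

    Assumption: equal weights per arena. The real index may weight differently.
    """
    normalized = {
        arena: min_max_normalize({name: row.get(arena) for name, row in models.items()})
        for arena in ARENAS
    }
    result: dict[str, Optional[float]] = {}
    for name in models:
        available = [normalized[a][name] for a in ARENAS if normalized[a][name] is not None]
        result[name] = round(sum(available) / len(available), 1) if available else None
    return result

# Hypothetical models with illustrative scores (not rows from the table).
example = {
    "model-a": {"text": 1500.0, "docs": None, "vision": 1300.0},
    "model-b": {"text": 1480.0, "docs": None, "vision": 1310.0},
    "model-c": {"text": None, "docs": None, "vision": 1250.0},
}
index = proxy_index(example)
```

Averaging only over the arenas that list a model keeps a missing Docs or Text score from counting as zero, which is one plausible way to rank models that appear in a single arena.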
Top 30 of 152
| Rank | Model | Index (reasoning proxy) | Text Arena | Docs Arena | Vision Arena | Mode signal | Sources |
|---|---|---|---|---|---|---|---|
| #1 | | 99 | 1,502 (rank #1) | Not listed | 1,300 (rank #3) | Reasoning | Text Arena, Docs Arena, Vision Arena |
| #2 | | 99 | 1,500 (rank #2) | Not listed | 1,306 (rank #1) | Reasoning | Text Arena, Docs Arena, Vision Arena |
| #3 | | 95 | 1,498 (rank #3) | Not listed | 1,293 (rank #5) | Standard | Text Arena, Docs Arena, Vision Arena |
| #4 | | 94 | 1,494 (rank #4) | Not listed | 1,304 (rank #2) | Standard | Text Arena, Docs Arena, Vision Arena |
| #5 | | 89 | 1,489 (rank #5) | Not listed | 1,296 (rank #4) | Standard | Text Arena, Docs Arena, Vision Arena |
| #6 | | 86 | 1,486 (rank #7) | Not listed | 1,289 (rank #6) | Standard | Text Arena, Docs Arena, Vision Arena |
| #7 | | 85 | 1,487 (rank #6) | Not listed | 1,277 (rank #10) | Standard | Text Arena, Docs Arena, Vision Arena |
| #8 | | 81 | 1,482 (rank #8) | Not listed | 1,278 (rank #9) | Reasoning | Text Arena, Docs Arena, Vision Arena |
| #9 | | 80 | 1,480 (rank #9) | Not listed | 1,277 (rank #11) | Reasoning | Text Arena, Docs Arena, Vision Arena |
| #10 | | 79 | Not listed | Not listed | 1,260 (rank #16) | Standard | Text Arena, Docs Arena, Vision Arena |
| #11 | | 79 | 1,476 (rank #11) | Not listed | 1,288 (rank #7) | Standard | Text Arena, Docs Arena, Vision Arena |
| #12 | | 77 | 1,476 (rank #12) | Not listed | 1,280 (rank #8) | Standard | Text Arena, Docs Arena, Vision Arena |
| #13 | | 75 | 1,474 (rank #17) | Not listed | 1,275 (rank #13) | Standard | Text Arena, Docs Arena, Vision Arena |
| #14 | | 75 | 1,479 (rank #10) | Not listed | Not listed | Standard | Text Arena, Docs Arena, Vision Arena |
| #15 | | 72 | 1,470 (rank #22) | Not listed | 1,275 (rank #12) | Standard | Text Arena, Docs Arena, Vision Arena |
| #16 | | 72 | 1,475 (rank #14) | Not listed | 1,251 (rank #21) | Reasoning | Text Arena, Docs Arena, Vision Arena |
| #17 | | 71 | 1,476 (rank #13) | Not listed | Not listed | Standard | Text Arena, Docs Arena, Vision Arena |
| #18 | | 71 | 1,469 (rank #23) | Not listed | 1,269 (rank #15) | Standard | Text Arena, Docs Arena, Vision Arena |
| #19 | | 70 | 1,475 (rank #15) | Not listed | Not listed | Standard | Text Arena, Docs Arena, Vision Arena |
| #20 | | 69 | 1,474 (rank #16) | Not listed | Not listed | Standard | Text Arena, Docs Arena, Vision Arena |
| #21 | | 69 | 1,472 (rank #20) | Not listed | 1,247 (rank #26) | Standard | Text Arena, Docs Arena, Vision Arena |
| #22 | | 68 | 1,473 (rank #19) | Not listed | Not listed | Reasoning | Text Arena, Docs Arena, Vision Arena |
| #23 | | 65 | 1,470 (rank #21) | Not listed | Not listed | Standard | Text Arena, Docs Arena, Vision Arena |
| #24 | | 64 | Not listed | Not listed | 1,227 (rank #37) | Standard | Text Arena, Docs Arena, Vision Arena |
| #25 | | 64 | 1,462 (rank #28) | Not listed | 1,259 (rank #19) | Standard | Text Arena, Docs Arena, Vision Arena |
| #26 | | 64 | 1,469 (rank #24) | Not listed | Not listed | Standard | Text Arena, Docs Arena, Vision Arena |
| #27 | | 64 | Not listed | Not listed | 1,226 (rank #39) | Standard | Text Arena, Docs Arena, Vision Arena |
| #28 | | 63 | 1,461 (rank #29) | Not listed | 1,260 (rank #17) | Reasoning | Text Arena, Docs Arena, Vision Arena |
| #29 | | 61 | Not listed | Not listed | 1,221 (rank #42) | Standard | Text Arena, Docs Arena, Vision Arena |
| #30 | | 60 | Not listed | Not listed | 1,219 (rank #43) | Standard | Text Arena, Docs Arena, Vision Arena |
Benchmark guide
A quick reading key for reasoning and knowledge rankings while dedicated reasoning benchmark feeds are being integrated.
The reasoning proxy currently blends Text, Docs, and Vision Arena scores. It is an orientation layer for broad model capability, not a replacement for dedicated reasoning benchmarks like LiveBench, GPQA, AIME, or MMLU-Pro.
Arena scores are live public signals with broad model coverage. They are useful while dedicated reasoning feeds are being wired in, but they should be read as preference-based capability signals rather than exam-style accuracy scores.
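To make "preference-based" concrete, here is a small sketch that converts a gap between two Arena scores into an expected head-to-head preference rate. It assumes the scores sit on a conventional Elo-style scale (base 10, 400-point scale); the page does not confirm this, and `expected_preference` is a hypothetical helper name.

```python
def expected_preference(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Elo-style probability that model A is preferred over model B in one pairwise vote.

    Assumption: scores follow the usual base-10, 400-point Elo convention.
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))

# Text Arena scores taken from the table: ranks #1 and #2, then ranks #1 and #29.
p_close = expected_preference(1502, 1500)
p_wide = expected_preference(1502, 1461)
```

Under that assumption, the 2-point gap at the top of the table corresponds to a preference rate of roughly 50.3%, and the 41-point gap to roughly 55.9%. Neither figure says how often a model answers an exam-style question correctly.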
LiveBench is designed to update over time and reduce benchmark contamination risk. Once integrated, it can add more direct reasoning, math, data, and coding task signals to this page.
A model name or tag like Thinking, Reasoning, or High can indicate an inference mode, but it is not itself a benchmark score. The page treats those labels as context, not as proof of better reasoning.
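As an illustration of tagging from public names, the sketch below assigns a mode signal by matching the three labels mentioned above. The matching rule (whole word, case-insensitive) and the names `MODE_LABELS` and `mode_signal` are assumptions of this sketch, not the page's actual tagging logic.

```python
import re

# Labels named in the guide. Whole-word, case-insensitive matching is an assumption.
MODE_LABELS = ("thinking", "reasoning", "high")
_LABEL_PATTERN = re.compile(r"\b(" + "|".join(MODE_LABELS) + r")\b", re.IGNORECASE)

def mode_signal(public_name: str) -> str:
    """Return "Reasoning" if the public name carries a mode label, else "Standard".

    The result is context about the inference mode only; it says nothing
    about measured reasoning quality.
    """
    return "Reasoning" if _LABEL_PATTERN.search(public_name) else "Standard"

tag = mode_signal("Claude Opus 4.6 Thinking")  # the top proxy named on this page
```

Matching whole words keeps a name that merely contains one of the labels, such as a hypothetical "Highlander", from being tagged as a reasoning variant.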