A focused coding view built on Code Arena and SWE-bench today, with reserved source slots for Terminal-Bench and Aider-style coding benchmarks until those feeds are wired.
Ranked models: 102 (Code Arena or SWE-bench signal)
SWE-bench matches: 22 (matched to model rows)
Top coding index: Claude Opus 4.7 Thinking (100 coding index)
Best SWE-bench: 76.8% (Claude Opus 4.5 20251101)
Coding formula
The coding index blends Code Arena and SWE-bench where available, with output speed as a small tie-breaker. Terminal-Bench and Aider remain visible as planned feed slots until we have a reliable ingestion path.
Higher coding index is better. SWE-bench cells show verified model matches only.
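The blend described above can be sketched roughly as follows. This is a minimal illustration, not the page's actual formula: the weights, the Code Arena score range, the speed cap, and the function name `coding_index` are all assumptions made for the example, so it will not reproduce the index values in the table.

```python
from typing import Optional

# Hypothetical constants: the page does not publish its weights or ranges.
ARENA_WEIGHT = 0.5
SWE_WEIGHT = 0.5
SPEED_BONUS_MAX = 1.0                   # small tie-breaker, in index points
ARENA_MIN, ARENA_MAX = 1300.0, 1600.0   # assumed Code Arena score range
SPEED_CAP = 200.0                       # tok/s at which the bonus saturates


def _scale(value: float, low: float, high: float) -> float:
    """Clamp value into [low, high] and map it linearly onto 0..100."""
    clamped = max(low, min(high, value))
    return 100.0 * (clamped - low) / (high - low)


def coding_index(
    arena_score: Optional[float],
    swe_pct: Optional[float],
    speed_tok_s: Optional[float],
) -> Optional[float]:
    """Blend whichever signals are available; a missing signal is skipped, not zeroed."""
    parts = []
    if arena_score is not None:
        parts.append((ARENA_WEIGHT, _scale(arena_score, ARENA_MIN, ARENA_MAX)))
    if swe_pct is not None:
        parts.append((SWE_WEIGHT, swe_pct))
    if not parts:
        return None  # no coding signal at all, so no index

    # Renormalize by the weights actually present.
    total_weight = sum(weight for weight, _ in parts)
    blended = sum(weight * value for weight, value in parts) / total_weight

    # Output speed only nudges the result by at most SPEED_BONUS_MAX points.
    if speed_tok_s is not None:
        blended += SPEED_BONUS_MAX * min(speed_tok_s, SPEED_CAP) / SPEED_CAP
    return round(blended, 1)
```

Renormalizing by the weights that are present keeps a model with only a Code Arena score from being penalized as if its SWE-bench result were zero, which matches how the page asks missing cells to be read.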
Top 30 of 102
| Rank | Model | Index | Code Arena | SWE-bench | Speed | Run mode | Sources |
|---|---|---|---|---|---|---|---|
| #1 | | 100 | 1,567 (rank #1) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench |
| #2 | | 95 | 1,541 (rank #4) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench |
| #3 | | 88 | 1,508 (rank #9) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #4 | | 87 | 1,505 (rank #11) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench |
| #5 | | 86 | 1,538 (rank #5) | 75.6% (rank #4) | 42 tok/s | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #6 | | 84 | 1,490 (rank #12) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench |
| #7 | | 83 | 1,438 (rank #24) | 74.2% (rank #6) | Not listed | bash-only (mini 1.15.0) | Code Arena, SWE-bench, Artificial Analysis |
| #8 | | 82 | 1,562 (rank #2) | Not listed (no confident match) | 42 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #9 | | 80 | 1,467 (rank #16) | 76.8% (rank #1) | 54 tok/s | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #10 | | 79 | 1,542 (rank #3) | Not listed (no confident match) | 47 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #11 | | 78 | 1,404 (rank #32) | 72.8% (rank #9) | Not listed | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #12 | | 77 | 1,533 (rank #6) | Not listed (no confident match) | 49 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #13 | | 77 | 1,457 (rank #19) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench |
| #14 | | 76 | 1,523 (rank #7) | Not listed (no confident match) | 47 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #15 | | 75 | 1,506 (rank #10) | Not listed (no confident match) | 198 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #16 | | 75 | 1,518 (rank #8) | Not listed (no confident match) | 42 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #17 | | 74 | 1,436 (rank #27) | 72.8% (rank #8) | 80 tok/s | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #18 | | 73 | 1,387 (rank #42) | 75.8% (rank #2) | 184 tok/s | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #19 | | 73 | 1,437 (rank #25) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench |
| #20 | | 72 | 1,382 (rank #45) | 75.8% (rank #3) | 202 tok/s | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #21 | | 71 | 1,431 (rank #29) | 70.8% (rank #13) | 33 tok/s | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #22 | | 69 | 1,486 (rank #13) | Not listed (no confident match) | 41 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #23 | | 68 | 1,479 (rank #14) | Not listed (no confident match) | 61 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #24 | | 67 | 1,386 (rank #43) | 71.4% (rank #12) | 43 tok/s | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #25 | | 67 | 1,332 (rank #58) | 70% (rank #15) | Not listed | bash-only (mini 2.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #26 | | 67 | 1,408 (rank #30) | Not listed (no confident match) | Not listed | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench |
| #27 | | 66 | 1,471 (rank #15) | Not listed (no confident match) | 46 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #28 | | 66 | Not listed | 67.6% (rank #18) | 37 tok/s | bash-only (mini 1.0.0) | Code Arena, SWE-bench, Artificial Analysis |
| #29 | | 65 | 1,464 (rank #17) | Not listed (no confident match) | 45 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
| #30 | | 65 | 1,460 (rank #18) | Not listed (no confident match) | 53 tok/s | Planned (Terminal-Bench + Aider feed) | Code Arena, SWE-bench, Artificial Analysis |
Benchmark guide
A quick reading key for comparing coding models without confusing source coverage, preference scores, and real issue-resolution benchmarks.
The coding index combines Code Arena and SWE-bench signals where available, with output speed used as a small tie-breaker. It is a practical coding comparison, not a guarantee that the model will solve every repository task.
Code Arena is a public preference-style benchmark for coding outputs, while SWE-bench measures real software issue resolution. Code Arena is broader and more available; SWE-bench is more task-specific and harder to match across model names.
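To illustrate why SWE-bench rows are harder to match across model names, here is a small sketch of a conservative matcher. The normalization rules, the helper names `normalize_name` and `match_swe_bench`, and the slug `claude-opus-4-5` are assumptions for the example; the page does not describe how it matches rows. Only the row name `Claude Opus 4.5 20251101` and its 76.8% score come from the page.

```python
import re
from typing import Dict, Optional


def normalize_name(name: str) -> str:
    """Lowercase, split on non-alphanumerics, and drop 8-digit date stamps."""
    tokens = re.split(r"[^a-z0-9]+", name.lower())
    kept = [t for t in tokens if t and not re.fullmatch(r"\d{8}", t)]
    return " ".join(kept)


def match_swe_bench(model_name: str, swe_rows: Dict[str, float]) -> Optional[float]:
    """Return a score only when exactly one SWE-bench row shares the normalized name."""
    key = normalize_name(model_name)
    hits = [
        score
        for row_name, score in swe_rows.items()
        if normalize_name(row_name) == key
    ]
    # Zero hits or several hits both count as "no confident match".
    return hits[0] if len(hits) == 1 else None


# Hypothetical usage with one row taken from the page.
swe_rows = {"Claude Opus 4.5 20251101": 76.8}
print(match_swe_bench("claude-opus-4-5", swe_rows))           # matched
print(match_swe_bench("Claude Opus 4.7 Thinking", swe_rows))  # no confident match
```

Returning `None` on ambiguity trades coverage for accuracy: a variant such as a "Thinking" mode stays unmatched rather than inheriting a score from a differently configured run.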
Terminal-Bench is relevant for agentic coding workflows, but this page does not mix it into rankings until a reliable source feed is wired. Planned benchmarks are called out so the table does not imply hidden or invented scores.
Empty SWE-bench cells mean there is no public model row or no confident match to the model name in this table. They should be read as missing source coverage, not as a zero percent result.
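The difference between missing coverage and a zero result is easy to lose when the table is processed programmatically. The sketch below shows one way to keep it; the helper names `parse_swe_cell` and `mean_of_listed` are hypothetical, and the cell strings are assumed to look like the ones in the table above.

```python
from typing import List, Optional


def parse_swe_cell(cell: str) -> Optional[float]:
    """Turn a SWE-bench cell into a percentage, or None when no score is listed."""
    text = cell.strip()
    if not text or text.lower().startswith("not listed"):
        return None  # missing source coverage, not a 0% result
    # Keep only the number before the percent sign, e.g. "76.8% (rank #1)".
    return float(text.split("%")[0])


def mean_of_listed(cells: List[str]) -> Optional[float]:
    """Average only the cells that carry a score; unlisted cells are skipped."""
    scores = [s for s in (parse_swe_cell(c) for c in cells) if s is not None]
    return sum(scores) / len(scores) if scores else None


cells = ["76.8% (rank #1)", "Not listed (no confident match)", "75.8% (rank #2)"]
print(mean_of_listed(cells))
```

Treating the unlisted cell as 0 instead would drag the average of these three cells down by about a third, which is exactly the misreading the note above warns against.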