A focused page for document-heavy and retrieval-heavy work. The current view uses Docs, Search, and Text Arena signals while HELM Long Context and context-window metadata are prepared as the next data layer.
Ranked models: 123 (Docs, Search, or Text signal)
Docs coverage: 0 Document Arena rows
Search coverage: 29 Search Arena rows
HELM context: Planned (long-context leaderboard feed)
Context formula
The context proxy blends Document Arena, Search Arena, and Text Arena scores. Document Arena currently contributes no rows, so in practice the index rests on Search Arena and Text Arena scores. The proxy does not claim a context-window size or a true long-context pass rate until a dedicated source row is available.
A higher proxy index is better. HELM Long Context is shown as a planned source rather than a hidden assumption.
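The page does not publish the exact blend, so the following is only a minimal sketch of one plausible approach: scale each available arena score to 0-100 with min-max normalization, then take an unweighted mean over the signals that are present. The function names, the equal weighting, and the score ranges are all assumptions made for illustration, not the site's actual formula.

```python
from statistics import mean
from typing import Dict, Optional, Tuple

# Hypothetical (min, max) score bounds per arena; illustrative values only.
RANGES: Dict[str, Tuple[float, float]] = {
    "docs": (1000.0, 1300.0),
    "search": (1100.0, 1200.0),
    "text": (1400.0, 1500.0),
}


def min_max(score: float, lo: float, hi: float) -> float:
    """Scale a raw arena score to the 0-100 range within [lo, hi]."""
    if hi == lo:
        return 100.0
    return 100.0 * (score - lo) / (hi - lo)


def context_proxy(
    docs: Optional[float],
    search: Optional[float],
    text: Optional[float],
    ranges: Dict[str, Tuple[float, float]] = RANGES,
) -> Optional[int]:
    """Average the normalized signals that are present.

    A signal that is "Not listed" is passed as None and skipped,
    so it neither helps nor penalizes the model. Returns None when
    no signal is available at all.
    """
    signals = {"docs": docs, "search": search, "text": text}
    scaled = [
        min_max(value, *ranges[name])
        for name, value in signals.items()
        if value is not None
    ]
    return round(mean(scaled)) if scaled else None


if __name__ == "__main__":
    # A model with only a Search Arena score at the top of the assumed range.
    print(context_proxy(None, 1200.0, None))
    # A model with both Search and Text scores at the midpoint of each range.
    print(context_proxy(None, 1150.0, 1450.0))
```

Skipping missing signals rather than treating them as zero matches how the table ranks models that appear in only one arena, but the real weighting may differ.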
Top 30 of 123
| Rank | Model | Index (context proxy) | Docs (Docs Arena) | Search (Search Arena) | Text (Text Arena) | Context (long-context feed) | Sources |
|---|---|---|---|---|---|---|---|
| #1 | | 100 | Not listed | Not listed | 1,502 (rank #1) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #2 | | 100 | Not listed | 1,223 (rank #1) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #3 | | 98 | Not listed | Not listed | 1,500 (rank #2) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #4 | | 98 | Not listed | 1,219 (rank #2) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #5 | | 96 | Not listed | Not listed | 1,498 (rank #3) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #6 | | 93 | Not listed | 1,212 (rank #4) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #7 | | 93 | Not listed | 1,214 (rank #3) | 1,494 (rank #4) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #8 | | 87 | Not listed | 1,202 (rank #7) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #9 | | 86 | Not listed | Not listed | 1,489 (rank #5) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #10 | | 85 | Not listed | 1,199 (rank #9) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #11 | | 84 | Not listed | 1,197 (rank #10) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #12 | | 84 | Not listed | Not listed | 1,487 (rank #6) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #13 | | 83 | Not listed | 1,196 (rank #11) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #14 | | 83 | Not listed | 1,196 (rank #12) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #15 | | 82 | Not listed | Not listed | 1,486 (rank #7) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #16 | | 81 | Not listed | 1,211 (rank #5) | 1,472 (rank #20) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #17 | | 79 | Not listed | 1,190 (rank #14) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #18 | | 79 | Not listed | 1,199 (rank #8) | 1,476 (rank #13) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #19 | | 78 | Not listed | 1,188 (rank #15) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #20 | | 78 | Not listed | Not listed | 1,482 (rank #8) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #21 | | 76 | Not listed | Not listed | 1,480 (rank #9) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #22 | | 75 | Not listed | Not listed | 1,479 (rank #10) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #23 | | 74 | Not listed | 1,181 (rank #16) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #24 | | 74 | Not listed | 1,193 (rank #13) | 1,470 (rank #21) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #25 | | 73 | Not listed | 1,179 (rank #17) | Not listed | HELM pending | Docs Arena, Search Arena, Text Arena |
| #26 | | 71 | Not listed | Not listed | 1,476 (rank #11) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #27 | | 71 | Not listed | Not listed | 1,476 (rank #12) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #28 | | 70 | Not listed | Not listed | 1,475 (rank #14) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #29 | | 70 | Not listed | Not listed | 1,475 (rank #15) | HELM pending | Docs Arena, Search Arena, Text Arena |
| #30 | | 69 | Not listed | Not listed | 1,474 (rank #16) | HELM pending | Docs Arena, Search Arena, Text Arena |
Benchmark guide
A quick reading key for long-context rankings, document-heavy work, retrieval signals, and planned context-window data.
The long-context proxy blends Document Arena, Search Arena, and Text Arena scores. It is meant to surface models that appear strong on document-heavy and retrieval-heavy work while dedicated long-context benchmarks are being wired in.
Context window size and max output tokens are not shown yet: they need a reliable source before they become table fields. This ranking focuses on public task performance, not advertised token limits.
HELM Long Context is the right benchmark family for more direct long-context evaluation, but it is not mixed into the score until the feed is integrated and matched cleanly to model rows.
Document scores point toward long-form reading and file-style tasks. Search scores point toward retrieval-style workflows. Both are useful signals, but neither proves that a model can reliably use every token in a huge context window.