Benchmark suite, updated May 27, 2026

Long context rankings.

A focused page for document-heavy and retrieval-heavy work. The current view uses Docs, Search, and Text Arena signals while HELM Long Context and context-window metadata are prepared as the next data layer.

  • Ranked models: 123 (Docs, search, or text signal)
  • Docs coverage: 0 Document Arena rows
  • Search coverage: 29 Search Arena rows
  • HELM context: Planned (long-context leaderboard feed)

Context formula

Document skill first, retrieval support second.

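The context proxy blends Document Arena, Search Arena, and Text Arena scores. It does not claim a context-window size or true long-context pass rate until a dedicated source row is available. With Document Arena coverage currently at 0 rows, the index values shown on this page rest on Search Arena and Text Arena scores alone.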
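As a rough illustration of how such a blend could work, the sketch below normalizes each arena score to a 0-100 scale and averages whichever signals are listed. The page does not publish its formula, so the weights, the min-max normalization, the score ranges, and the function names here are all assumptions made for illustration, not the tracker's actual method.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical weights: the page says documents come first and retrieval
# second, but it does not publish numeric weights.
WEIGHTS = {"docs": 0.5, "search": 0.3, "text": 0.2}


def normalize(score: float, lo: float, hi: float) -> float:
    """Map a raw arena score onto 0-100 with min-max scaling (an assumption)."""
    if hi == lo:
        return 100.0
    return 100.0 * (score - lo) / (hi - lo)


def context_proxy(
    scores: Mapping[str, Optional[float]],
    ranges: Mapping[str, Tuple[float, float]],
    weights: Mapping[str, float] = WEIGHTS,
) -> Optional[float]:
    """Weighted mean of the normalized signals that are actually listed.

    A signal shown as "Not listed" is passed as None and skipped; the
    remaining weights are renormalized so they still sum to 1.
    """
    total = 0.0
    weight_sum = 0.0
    for name, raw in scores.items():
        if raw is None:
            continue
        lo, hi = ranges[name]
        total += weights[name] * normalize(raw, lo, hi)
        weight_sum += weights[name]
    if weight_sum == 0:
        return None  # no signal at all, so no proxy can be computed
    return total / weight_sum


if __name__ == "__main__":
    # Ranges are placeholders; a real pipeline would take them from the data.
    ranges = {"search": (1100.0, 1223.0), "text": (1400.0, 1502.0)}
    row = {"docs": None, "search": 1214.0, "text": 1494.0}
    print(context_proxy(row, ranges))
```

Renormalizing the weights keeps a model with only one listed signal on the same 0-100 scale as a model with two or three, which is one plausible way to rank rows with uneven coverage side by side.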

Long-context proxy

A higher proxy index is better. HELM Long Context is shown as a planned source rather than a hidden assumption.

Top 30 of 123

| Rank | Model | Model ID | Provider | License | Index (context proxy) | Docs Arena | Search Arena | Text Arena | Context |
|---|---|---|---|---|---|---|---|---|---|
| #1 | Claude Opus 4.6 Thinking | claude-opus-4-6-thinking | Anthropic | Proprietary | 100 | Not listed | Not listed | 1,502 (rank #1) | HELM pending |
| #2 | GPT-5.5 Search | gpt-5.5-search | OpenAI | Proprietary | 100 | Not listed | 1,223 (rank #1) | Not listed | HELM pending |
| #3 | Claude Opus 4.7 Thinking | claude-opus-4-7-thinking | Anthropic | Proprietary | 98 | Not listed | Not listed | 1,500 (rank #2) | HELM pending |
| #4 | Claude Opus 4.6 Search | claude-opus-4-6-search | Anthropic | Proprietary | 98 | Not listed | 1,219 (rank #2) | Not listed | HELM pending |
| #5 | Claude Opus 4.6 | claude-opus-4-6 | Anthropic | Proprietary | 96 | Not listed | Not listed | 1,498 (rank #3) | HELM pending |
| #6 | Gemini 3.1 Pro Grounding | gemini-3.1-pro-grounding | Google | Proprietary | 93 | Not listed | 1,212 (rank #4) | Not listed | HELM pending |
| #7 | Claude Opus 4.7 | claude-opus-4-7 | Anthropic | Proprietary | 93 | Not listed | 1,214 (rank #3) | 1,494 (rank #4) | HELM pending |
| #8 | Gemini 3 Pro Grounding | gemini-3-pro-grounding | Google | Proprietary | 87 | Not listed | 1,202 (rank #7) | Not listed | HELM pending |
| #9 | Muse Spark | muse-spark | Meta | Proprietary | 86 | Not listed | Not listed | 1,489 (rank #5) | HELM pending |
| #10 | GPT-5.4 Search | gpt-5.4-search | OpenAI | Proprietary | 85 | Not listed | 1,199 (rank #9) | Not listed | HELM pending |
| #11 | Gemini 3 Flash Grounding | gemini-3-flash-grounding | Google | Proprietary | 84 | Not listed | 1,197 (rank #10) | Not listed | HELM pending |
| #12 | Gemini 3.1 Pro Preview | gemini-3.1-pro-preview | Google | Proprietary | 84 | Not listed | Not listed | 1,487 (rank #6) | HELM pending |
| #13 | Grok 4.1 Fast Search | grok-4-1-fast-search | xAI | Proprietary | 83 | Not listed | 1,196 (rank #11) | Not listed | HELM pending |
| #14 | Claude Sonnet 4.6 Search | claude-sonnet-4-6-search | Anthropic | Proprietary | 83 | Not listed | 1,196 (rank #12) | Not listed | HELM pending |
| #15 | Gemini 3 Pro | gemini-3-pro | Google | Proprietary | 82 | Not listed | Not listed | 1,486 (rank #7) | HELM pending |
| #16 | Grok 4.20 Multi Agent Beta 0309 | grok-4.20-multi-agent-beta-0309 | xAI | Proprietary | 81 | Not listed | 1,211 (rank #5) | 1,472 (rank #20) | HELM pending |
| #17 | Claude Opus 4.5 Search | claude-opus-4-5-search | Anthropic | Proprietary | 79 | Not listed | 1,190 (rank #14) | Not listed | HELM pending |
| #18 | Grok 4.20 Beta 1 | grok-4.20-beta1 | xAI | Proprietary | 79 | Not listed | 1,199 (rank #8) | 1,476 (rank #13) | HELM pending |
| #19 | O3 Search | o3-search | OpenAI | Proprietary | 78 | Not listed | 1,188 (rank #15) | Not listed | HELM pending |
| #20 | GPT-5.5 High | gpt-5.5-high | OpenAI | Proprietary | 78 | Not listed | Not listed | 1,482 (rank #8) | HELM pending |
| #21 | GPT-5.4 High | gpt-5.4-high | OpenAI | Proprietary | 76 | Not listed | Not listed | 1,480 (rank #9) | HELM pending |
| #22 | Gemini 3.5 Flash | gemini-3.5-flash | Google | Proprietary | 75 | Not listed | Not listed | 1,479 (rank #10) | HELM pending |
| #23 | GPT-5.1 Search | gpt-5.1-search | OpenAI | Proprietary | 74 | Not listed | 1,181 (rank #16) | Not listed | HELM pending |
| #24 | ERNIE 5.1 | ernie-5.1 | Baidu | Proprietary | 74 | Not listed | 1,193 (rank #13) | 1,470 (rank #21) | HELM pending |
| #25 | GPT-5 Search | gpt-5-search | OpenAI | Proprietary | 73 | Not listed | 1,179 (rank #17) | Not listed | HELM pending |
| #26 | GPT-5.5 | gpt-5.5 | OpenAI | Proprietary | 71 | Not listed | Not listed | 1,476 (rank #11) | HELM pending |
| #27 | GPT-5.2 Chat Latest 20260210 | gpt-5.2-chat-latest-20260210 | OpenAI | Proprietary | 71 | Not listed | Not listed | 1,476 (rank #12) | HELM pending |
| #28 | Grok 4.20 Beta 0309 Reasoning | grok-4.20-beta-0309-reasoning | xAI | Proprietary | 70 | Not listed | Not listed | 1,475 (rank #14) | HELM pending |
| #29 | Qwen3.7 Max Preview | qwen3.7-max-preview | Alibaba | Proprietary | 70 | Not listed | Not listed | 1,475 (rank #15) | HELM pending |
| #30 | GLM 5.1 | glm-5.1 | Z.ai | Open weights | 69 | Not listed | Not listed | 1,474 (rank #16) | HELM pending |

Sources for every row: Docs Arena, Search Arena, Text Arena. "HELM pending" refers to the planned long-context feed.

Benchmark guide

What the scores mean.

A quick reading key for long-context rankings, document-heavy work, retrieval signals, and planned context-window data.

Higher is better: context proxy. HELM feed: planned.
What does the long-context proxy measure?

The long-context proxy blends Document Arena, Search Arena, and Text Arena scores. It is meant to surface models that appear strong on document-heavy and retrieval-heavy work while dedicated long-context benchmarks are being wired in.

Does this measure context window size?

Not yet. Context window size and max output tokens need a reliable source before they become table fields. This ranking focuses on public task performance, not advertised token limits.

Why is HELM Long Context listed as planned?

HELM Long Context is the right benchmark family for more direct long-context evaluation, but it is not mixed into the score until the feed is integrated and matched cleanly to model rows.
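To make "matched cleanly to model rows" concrete, here is a minimal sketch of one way an external feed could be joined to tracker rows by model ID, with anything unmatched or ambiguous held back instead of guessed. The feed field names (`model`, `score`), the normalization rule, and the function names are hypothetical; the real HELM feed format is not described on this page.

```python
from typing import Dict, Iterable, List, Mapping, Tuple


def normalize_id(model_id: str) -> str:
    """Lowercase and unify spaces/underscores so trivially different IDs match."""
    return model_id.strip().lower().replace("_", "-").replace(" ", "-")


def match_feed(
    tracker_ids: Iterable[str],
    feed_rows: Iterable[Mapping[str, object]],
) -> Tuple[Dict[str, object], List[str]]:
    """Return (matched, unmatched).

    matched maps a tracker model ID to its feed score; unmatched lists tracker
    IDs with no feed row or with more than one candidate row.
    """
    by_key: Dict[str, object] = {}
    ambiguous = set()
    for row in feed_rows:
        key = normalize_id(str(row["model"]))  # "model" is an assumed field name
        if key in by_key:
            ambiguous.add(key)  # two feed rows claim the same model
        by_key[key] = row["score"]  # "score" is an assumed field name

    matched: Dict[str, object] = {}
    unmatched: List[str] = []
    for tracker_id in tracker_ids:
        key = normalize_id(tracker_id)
        if key in by_key and key not in ambiguous:
            matched[tracker_id] = by_key[key]
        else:
            unmatched.append(tracker_id)
    return matched, unmatched


if __name__ == "__main__":
    # Placeholder feed values for illustration only, not real HELM results.
    feed = [{"model": "Claude_Opus-4-6", "score": 0.0}]
    print(match_feed(["claude-opus-4-6", "glm-5.1"], feed))
```

Holding back ambiguous rows mirrors the page's stated approach of keeping a source out of the score until it lines up with the existing model rows.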

How should I read document and search scores?

Document scores point toward long-form reading and file-style tasks. Search scores point toward retrieval-style workflows. Both are useful signals, but neither proves that a model can reliably use every token in a huge context window.