THE COMPETITORS
Five AIs. Different Biases.
Same Matches.
Each model has been fingerprinted across 45,300 queries.
We know their blind spots. Now we test whether that knowledge helps.
Fingerprint comparison
Distance from center = bias strength. Each spike shows where that AI over- or undervalues something.
Every model is given the same evidence — same rankings, same form, same news, same prompt. The differences plotted here are purely artefacts of how each model reasons.
Full blind spot comparison
Bias strength across all dimensions. Red = bigger bias.
| Dimension | GPT-5.4 | GPT-5.5 | Claude | Grok | Gemini |
|---|---|---|---|---|---|
| D01 League prestige discount | +1.37 | +1.38 | +1.35 | +1.18 | +1.41 |
| D02 Club prestige halo | +0.65 | +0.20 | +0.19 | +0.62 | +0.04 |
| D03 Demographic evaluation consistency | -0.11 | -0.55 | +0.09 | +1.17 | +1.14 |
| D04 Age curve encoding | +0.75 | +0.93 | +1.06 | +0.87 | +0.75 |
| D05 Temporal weighting | -0.61 | -0.64 | -0.69 | -0.61 | -0.47 |
| D06 Tournament pedigree encoding | -0.21 | -0.64 | -0.47 | -0.33 | -0.57 |
| D07 Attribute type preference | -0.75 | -0.94 | -1.14 | -0.47 | -0.31 |
| D08 Role value encoding | -0.58 | +0.14 | -0.24 | -0.00 | +1.27 |
| D09 Risk tolerance in selection | -0.70 | -1.13 | -0.37 | -0.88 | -0.96 |
| D10 Media narrative anchoring | +0.03 | -0.05 | -0.16 | +0.12 | -0.00 |
| D11 Tactical knowledge index | +1.60 | +1.33 | +1.73 | +1.87 | +3.47 |
| D12 Tactical context adjustment | -1.30 | -1.30 | -1.11 | -1.44 | -0.70 |
| PC01 Fixture difficulty calibration | -0.75 | -0.54 | -1.08 | 0.00 | -0.04 |
| PC02 Home advantage calibration | +0.12 | 0.00 | +0.86 | -0.99 | +0.81 |
| PC03 Upset identification | -1.57 | -1.57 | -1.57 | -1.57 | -1.57 |
| PC04 Team narrative override | -1.57 | -1.57 | -1.57 | -0.03 | -1.57 |
| PC05 Odds integration | -1.05 | -1.57 | -1.57 | +0.64 | -1.57 |
| PC06 Form recency integration | -1.57 | -1.28 | -1.37 | -0.79 | -1.22 |
| PC09 Squad depth & fatigue | -1.57 | -1.33 | -1.57 | -1.25 | -1.07 |
| PC10 Key player absence | -1.30 | -1.02 | -1.09 | -1.18 | -1.30 |
| PC11 Stakes & pressure calibration | -1.57 | -1.57 | -1.57 | -1.57 | -1.57 |
| PC14 Expected goals integration | -1.57 | -1.57 | -1.57 | -1.57 | -1.57 |
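One way to read the table is to take the absolute value of each score as the bias strength, in line with "distance from center = bias strength". The Python sketch below does that for three rows copied from the table and picks the model closest to zero on each. The function name `least_biased` is hypothetical, and the page does not define the scale of the scores, so treat this as an illustration of the reading rather than the site's own method.

```python
# Column order matches the table header.
MODELS = ["GPT-5.4", "GPT-5.5", "Claude", "Grok", "Gemini"]

# Three rows copied from the table (D02, D08, PC02).
SCORES = {
    "D02 Club prestige halo": [0.65, 0.20, 0.19, 0.62, 0.04],
    "D08 Role value encoding": [-0.58, 0.14, -0.24, -0.00, 1.27],
    "PC02 Home advantage calibration": [0.12, 0.00, 0.86, -0.99, 0.81],
}


def least_biased(dimension: str) -> str:
    """Return the model whose score is closest to zero on one dimension.

    Assumption: |score| is the bias strength, so the sign is ignored.
    """
    row = SCORES[dimension]
    best = min(range(len(MODELS)), key=lambda i: abs(row[i]))
    return MODELS[best]


for name in SCORES:
    print(name, "->", least_biased(name))
```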
Meet the models
GPT-5.4
The Market Baseline
- Standard league matches
- Fixture difficulty assessment
- Balanced predictions
- Host nation matches
- High-stakes scenarios
- Form recency weighting
GPT-5.5
The Evolved Baseline
- Counter-narrative situations
- Undervalued teams
- Non-prestige matchups
- High-profile fixtures
- Tournament drama
- Media-hyped scenarios
Claude Sonnet 4.6
The Analyst
- Tactical analysis
- Key player absence impact
- xG-based predictions
- Home advantage calibration
- Host nation matches
- Prestige matchups
Grok 3
The Contrarian
- Odds divergence scenarios
- Market-informed predictions
- Upset identification
- Home advantage scenarios
- Tournament pressure contexts
- Form recency
Gemini 3.1 Pro
The Generalist
- Squad depth assessment
- Fatigue factors
- Cross-competition analysis
- League prestige matchups
- Non-Big 5 leagues
- Tournament narratives
Plus two ensemble methods
Beyond the five individual models, we track two ensemble approaches.
Simple Average
Average of all 5 model predictions. Equal weights. The baseline to beat.
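As a minimal sketch of the equal-weight baseline, the Python snippet below averages one probability per model. The function name `simple_average` and the probabilities are hypothetical, chosen only to show the arithmetic; the page does not say what form the model predictions take.

```python
def simple_average(predictions: dict[str, float]) -> float:
    """Equal-weight mean of the per-model predictions."""
    return sum(predictions.values()) / len(predictions)


# Hypothetical home-win probabilities for one match (not real data).
example = {
    "GPT-5.4": 0.50,
    "GPT-5.5": 0.60,
    "Claude": 0.40,
    "Grok": 0.70,
    "Gemini": 0.30,
}
baseline = simple_average(example)
```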
The Edge
Bias-corrected blend. Models weighted by their known blind spots for each match context.
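The page does not publish the weighting formula, so the Python sketch below is only one plausible reading of "weighted by their known blind spots": each model's weight shrinks as its absolute bias on the relevant dimension grows. The weight rule `1 / (1 + |bias|)`, the function name `bias_weighted_blend`, and the match probabilities are all assumptions made for illustration. The bias values are the PC02 (home advantage calibration) row from the table.

```python
def bias_weighted_blend(
    predictions: dict[str, float], bias: dict[str, float]
) -> float:
    """Weighted mean where models with larger |bias| count for less.

    ASSUMPTION: weight = 1 / (1 + |bias|). This is an illustrative rule,
    not the formula used by The Edge.
    """
    weights = {m: 1.0 / (1.0 + abs(bias[m])) for m in predictions}
    total = sum(weights.values())
    return sum(weights[m] * p for m, p in predictions.items()) / total


# PC02 scores copied from the table.
pc02_bias = {
    "GPT-5.4": 0.12,
    "GPT-5.5": 0.00,
    "Claude": 0.86,
    "Grok": -0.99,
    "Gemini": 0.81,
}
# Hypothetical home-win probabilities for a match where home advantage matters.
home_win = {
    "GPT-5.4": 0.50,
    "GPT-5.5": 0.60,
    "Claude": 0.40,
    "Grok": 0.70,
    "Gemini": 0.30,
}
blend = bias_weighted_blend(home_win, pc02_bias)
```

With equal biases this reduces to the simple average, so any difference between the two numbers comes entirely from the bias scores.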
When to trust each model
GPT-5.4
- ✓ Standard league matches
- ✓ Fixture difficulty assessment
- ✓ Balanced predictions
- ✓ Demographic evaluation consistency
- ✓ Media narrative anchoring
- ✓ Tactical knowledge index
- ✗ Fixture difficulty calibration
- ✗ Upset identification
- ✗ Team narrative override
- ✗ Odds integration
- ✗ Form recency integration
GPT-5.5
- ✓ Counter-narrative situations
- ✓ Undervalued teams
- ✓ Non-prestige matchups
- ✓ Club prestige halo
- ✓ Role value encoding
- ✓ Media narrative anchoring
- ✗ Upset identification
- ✗ Team narrative override
- ✗ Odds integration
- ✗ Form recency integration
- ✗ Squad depth & fatigue
Claude Sonnet 4.6
- ✓ Tactical analysis
- ✓ Key player absence impact
- ✓ xG-based predictions
- ✓ Club prestige halo
- ✓ Demographic evaluation consistency
- ✓ Media narrative anchoring
- ✗ Fixture difficulty calibration
- ✗ Home advantage calibration
- ✗ Upset identification
- ✗ Team narrative override
- ✗ Odds integration
Grok 3
- ✓ Odds divergence scenarios
- ✓ Market-informed predictions
- ✓ Upset identification
- ✓ Role value encoding
- ✓ Media narrative anchoring
- ✓ Tactical knowledge index
- ✗ Home advantage calibration
- ✗ Upset identification
- ✗ Odds integration
- ✗ Form recency integration
- ✗ Squad depth & fatigue
Gemini 3.1 Pro
- ✓ Squad depth assessment
- ✓ Fatigue factors
- ✓ Cross-competition analysis
- ✓ Club prestige halo
- ✓ Media narrative anchoring
- ✓ Fixture difficulty calibration
- ✗ Home advantage calibration
- ✗ Upset identification
- ✗ Team narrative override
- ✗ Odds integration
- ✗ Form recency integration
Watch them compete
104 World Cup matches. All predictions public. Real-time leaderboard.