Blog

Ongoing research updates and findings from the Modelball study.

Latest

June 18, 2026 • results, model-accuracy

Four Favourites, Four Wins — But Who Read It Best?

Tuesday's four Group I and J matches all went to the pre-match favourite, giving every model a clean results sheet — yet the Brier scores reveal a wide spread in how confidently, and correctly, they called it.

June 18, 2026 • results, model-performance

Wednesday Washout and a Rout: Models Take Their Lumps on June 17

DR Congo held Portugal to a draw that no model really wanted to believe in, while England's 4-2 demolition of Croatia papered over some shaky probability estimates. Two findings from a busy Wednesday slate.

June 18, 2026 • results, group-stage

Four Matches, Four Draws: Models Get a Thorough Hiding

Monday's four Group G and H fixtures all ended in draws, and every single model got every single result wrong. A clean sweep for reality.

June 11, 2026 • predictions, World Cup

Locked in: our pre-tournament World Cup predictions

Before the opening whistle at the Azteca, all 360 group-stage predictions are logged and timestamped. England edges Argentina as favourite, Gemini backs Portugal, and all five models are cool on Brazil.

May 11, 2026 • models, methodology, GPT-5.5

GPT-5.5 takes the tactical-knowledge crown — once you let it finish answering

The 25-question Tactical Knowledge panel puts GPT-5.5 at the top of the field at 8.67/10 — but only after fixing a 500-token cap that had been silently hiding its responses.

May 6, 2026 • World Cup, predictions

What 18 leagues tell us about the World Cup

League data isn't World Cup data. But the patterns we found point to which models will be reliable in June, where the corrections will matter most, and where the system might break.

May 5, 2026 • models, findings

Per-model winners and losers across 18 leagues

Claude crushes it in La Liga but struggles in the Bundesliga. Grok is steady but unspectacular. GPT-5.4 surprises in MLS. The full breakdown by model and league.

May 4, 2026 • research, patterns

The geography of AI bias

A pattern emerged across 18 leagues: the further a league sits from the AI training-data centre of gravity, the more bias correction helps. The map matters.

May 3, 2026 • La Liga, findings

Why La Liga is the calibration sweet spot

A 13% Brier improvement, every model lifting, no model degrading. La Liga gave us our cleanest validation of the methodology — here's what it tells us about Spain at the World Cup.

May 2, 2026 • Premier League, findings

The Premier League blind spot

Of every league we tested, the Premier League moved the least when we corrected for bias. We think we know why, and what it means for England at the World Cup.

May 1, 2026 • calibration, limitations

Where bias correction fails (and what we learn from it)

Calibration doesn't always help. In four leagues it actively hurt our predictions — and that failure is more useful than any of the wins.

April 30, 2026 • calibration, findings

Where bias correction wins big

Five leagues where correcting AI biases improved predictions by more than 10%. The pattern is not what we expected.

April 29, 2026 • calibration, methodology

What 18 leagues and 979 matches taught us

Before the World Cup we ran our calibration across 18 leagues and almost a thousand matches. The headline: bias correction works — but not everywhere.

April 28, 2026 • methodology, transparency

Honest limitations: what we don't know yet

Transparency about our methodology's constraints — sample size, friendlies vs tournaments, and what we need the actual World Cup to teach us.

April 27, 2026 • preview, predictions

Five World Cup matches where models will clash

A preview of the group-stage matches where we expect the models to disagree most. These are the fixtures that will test The Edge methodology hardest.

April 26, 2026 • The Edge, methodology

The Edge explained: how we turn biases into better predictions

A plain-English explanation of our fingerprint-weighted ensemble. Three layers of intelligence, no equations required.

April 25, 2026 • methodology, divergence

What happens when all five models agree (and when they don't)

Model consensus isn't always right. Sometimes high divergence is the real signal — it reveals matches where our bias corrections matter most.

April 24, 2026 • Claude, bias

Home advantage: Claude thinks it matters more than it does

Claude over-adjusts for home advantage by 15-20 percentage points. Here's what that means for host-nation matches in the USA, Mexico, and Canada.

April 23, 2026 • bias, research

The prestige problem: why AI overvalues Premier League players

Our research reveals systematic bias toward Big 5 league players. When given identical stats, models prefer the player from the more prestigious league 58-71% of the time.

April 22, 2026 • models, introduction

Meet the models: how five AIs see football differently

An introduction to GPT-5.4, GPT-5.5, Claude, Grok, and Gemini — their personalities, strengths, and blind spots. Understanding why they disagree is key to our methodology.

April 21, 2026 • calibration, methodology

Pre-tournament calibration: 12 matches tested

We tested our prediction methodology on international friendlies played in March and April 2026. The ensemble beat market odds, and the models showed distinct behavioural patterns.
