ModelBall
May 4, 2026

The geography of AI bias

research, patterns

Step back from any individual league and a pattern shows up across all 18. The further a league sits from the centre of gravity of AI training data — mostly English-language analysis of Big-5 European football — the more our bias correction tends to help. That's a useful map heading into a global tournament.

The shape of the data

Order our 18 leagues by Brier improvement after correction, top to bottom, and a rough geographic / coverage gradient appears.

Improvement, sorted by direction

| Tier | Leagues | Avg improvement |
| --- | --- | --- |
| High lift | J1, Saudi Pro, MLS, La Liga, Brasileirão | +10 to +19% |
| Mild lift | Turkish, A-League, Russian, EPL, Belgian, Primeira | 0 to +4% |
| Slight drag | Ligue 1, Argentina Primera, Bundesliga | −0.3 to −4% |
| Negative | Serie A, Swiss Super, Scottish Prem, Eredivisie | −5 to −10% |
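For anyone who wants to sanity-check the ranking, here is a minimal sketch of how a Brier improvement and a tier label could be computed. It rests on assumptions rather than the pipeline itself: a three-outcome (home, draw, away) multi-class Brier score, improvement defined as the relative reduction in that score, and tier cut-offs read loosely off the ranges in the table. The function names and the two sample forecasts are hypothetical.

```python
from typing import Sequence


def brier_score(probs: Sequence[Sequence[float]], outcomes: Sequence[int]) -> float:
    """Mean multi-class Brier score over matches (lower is better).

    probs[i] is a (home, draw, away) forecast; outcomes[i] is the index
    of the result that actually happened.
    """
    if len(probs) != len(outcomes) or not outcomes:
        raise ValueError("need one outcome per forecast, and at least one match")
    total = 0.0
    for forecast, result in zip(probs, outcomes):
        # Squared error against the one-hot encoding of the actual result.
        total += sum(
            (p - (1.0 if k == result else 0.0)) ** 2
            for k, p in enumerate(forecast)
        )
    return total / len(outcomes)


def improvement_pct(raw: float, corrected: float) -> float:
    """Relative Brier reduction in percent; positive means correction helped."""
    if raw <= 0:
        raise ValueError("raw Brier score must be positive")
    return 100.0 * (raw - corrected) / raw


def tier(pct: float) -> str:
    """Map an improvement to a tier. Cut-offs are an assumption, not published values."""
    if pct >= 10:
        return "High lift"
    if pct >= 0:
        return "Mild lift"
    if pct > -5:
        return "Slight drag"
    return "Negative"


# Hypothetical forecasts for two matches: a draw (index 1), then a home win (index 0).
raw_probs = [[0.60, 0.25, 0.15], [0.50, 0.30, 0.20]]
corrected_probs = [[0.50, 0.30, 0.20], [0.50, 0.30, 0.20]]
results = [1, 0]

lift = improvement_pct(
    brier_score(raw_probs, results),
    brier_score(corrected_probs, results),
)
print(f"{lift:.1f}% -> {tier(lift)}")
```

Using a relative rather than an absolute difference keeps leagues with different baseline Brier scores comparable, which is what a cross-league ranking needs.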

What the gradient tracks

Two factors broadly explain the ordering, though neither does so fully:

How much training-data coverage the league gets. The Premier League gets a flood of English-language analysis. The Bundesliga, Serie A, and Ligue 1 get a steady stream. La Liga gets a flood too, but it is unevenly distributed (heavy on the top three clubs, lighter elsewhere). MLS, J1, Saudi Pro, and the Brasileirão get a fraction of that, much of it written from outside the league itself. Less data leaves more room for default heuristics, which is what our correction targets.

How well-behaved the league is statistically. Some leagues have stable, predictable patterns: clear hierarchies, repeatable home advantages, smooth scoring distributions. Others have huge variance: a few dominant clubs and a long tail (Scottish Prem), unusually frequent upsets (Eredivisie), or such a small team count that any season is high-noise (Swiss Super). A generic correction trips on those leagues because the "default" the model would have used was already closer to right.

The exception: La Liga

La Liga doesn't fit the "less coverage = more lift" story cleanly, because it has plenty of coverage. We covered yesterday why it still benefited: the coverage is uneven inside the league, with Madrid and Barcelona overrepresented and the rest underweighted. That is the same kind of internal asymmetry our correction was designed to address.

That detail matters because the World Cup will produce a lot of similar asymmetries. Inside any World Cup squad, players from the most-covered clubs are over-anchored relative to their teammates from less-covered ones. The squad-level prestige gap is exactly the La Liga effect at international scale.

What it implies for the World Cup

Apply the gradient to the actual World Cup field and you get a rough preview of where our methodology should add the most value:

High-lift expected. Matches involving Saudi Arabia, USA, Mexico, Canada, Japan, South Korea, Australia, Iran, and most African and Asian sides. These are squads where the models' priors are thin and the prestige defaults bite hardest.

Mild lift expected. Matches between two big-coverage European nations — England, Germany, France — where the models already have decent priors on most players.

Possible drag. Small-sample, high-variance fixtures — think Scotland or smaller European entrants in upset-prone matchups. We'll apply lighter correction there, in line with what the league test suggested.
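One way "lighter correction" could be implemented is a linear blend between the raw model probabilities and the fully corrected ones. The sketch below assumes that form. The blend weights, the tier keys, and the function name are hypothetical placeholders chosen for illustration, not the values or code we run.

```python
from typing import Dict, List, Sequence

# Hypothetical correction strength per expected tier (assumption, not tuned values).
CORRECTION_WEIGHT: Dict[str, float] = {
    "high_lift": 1.0,       # apply the full correction
    "mild_lift": 0.5,       # apply half of it
    "possible_drag": 0.25,  # apply only a light touch
}


def blend(raw: Sequence[float], corrected: Sequence[float], weight: float) -> List[float]:
    """Move from raw towards corrected probabilities by `weight` in [0, 1]."""
    if len(raw) != len(corrected):
        raise ValueError("raw and corrected must have the same number of outcomes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    mixed = [(1.0 - weight) * r + weight * c for r, c in zip(raw, corrected)]
    total = sum(mixed)
    # Renormalise so rounding in the inputs cannot leave the forecast off 1.0.
    return [m / total for m in mixed]


# Hypothetical (home, draw, away) forecast for a fixture in the possible-drag group.
raw_forecast = [0.60, 0.25, 0.15]
corrected_forecast = [0.50, 0.30, 0.20]
light = blend(raw_forecast, corrected_forecast, CORRECTION_WEIGHT["possible_drag"])
print([round(p, 4) for p in light])
```

A weight of 0 returns the raw forecast and a weight of 1 returns the corrected one, so the same function covers all three groups above.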

The map isn't perfect: it explains neither Serie A's negative result nor the Bundesliga drag, even though both leagues get steady coverage. But it accounts for most of the ordering, and it's the closest thing to a working theory of where this methodology fits.

What's next

Tomorrow: per-model breakdowns. Across 18 leagues, no single model was consistently best. Claude wins big in La Liga and struggles in Bundesliga. Grok is steady but unspectacular. The full per-model league-by-league table is the single most useful artefact we have for picking which model to weight on which fixture.
