Prediction Models
Walk-forward backtested on historical pro matches · honest numbers, not marketing
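As a rough illustration of what "walk-forward" means here: every match is predicted using only what the model knew before it started, and only then is the result folded back in. The sketch below is not the site's pipeline. The `walk_forward` helper, the `predict`/`update` callables, and the toy win-rate model are all hypothetical names chosen for the example.

```python
def walk_forward(matches, predict, update):
    """Score every match using only what was known before it started.

    matches: iterable sorted by start time, oldest first.
    predict(match) -> probability that the first-listed team wins.
    update(match)  -> folds the finished match into the model state.
    """
    predictions = []
    for match in matches:
        predictions.append(predict(match))  # forecast first...
        update(match)                       # ...then learn from the result
    return predictions


# Toy model (assumption): predicts the running win rate of the first-listed team.
state = {"wins": 1, "games": 2}  # prior of one win in two games, i.e. 50%

def predict(match):
    return state["wins"] / state["games"]

def update(match):
    state["wins"] += match["first_won"]
    state["games"] += 1

history = [{"first_won": 1}, {"first_won": 1}, {"first_won": 0}]
preds = walk_forward(history, predict, update)
print(preds)
```

Because the prediction is recorded before `update` runs, no match ever contributes to its own forecast, which is what keeps the reported numbers honest.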
Head-to-Head Elo
elo_h2h · Classic Elo, no side or recency awareness.
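For readers unfamiliar with Elo, a minimal sketch of the standard expected-score and update rule follows. The 400-point logistic scale is the conventional one. The K-factor of 32 is an assumption, since the page does not state which value the model uses.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, a_won, k=32.0):
    """Return both ratings after one match. k=32 is an assumed K-factor."""
    p_a = expected_score(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - p_a)
    return rating_a + delta, rating_b - delta


print(round(expected_score(1600, 1500), 3))  # 0.64
print(update(1500, 1500, a_won=True))        # (1516.0, 1484.0)
```

A 100-point gap therefore maps to roughly a 64% win probability, and an upset moves ratings further than an expected result does.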
Accuracy
0.513
50% = coinflip
Brier score
0.2502
0.25 = coinflip · lower = better
Log-loss
0.6936
0.693 = coinflip · lower = better
Top-quartile acc
0.602
highest-confidence 25%
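The coinflip baselines quoted above follow directly from the metric definitions. The sketch below shows one plausible way to compute the four numbers. The function names and the exact definition of "top-quartile" (the 25% of predictions furthest from 0.5) are assumptions, not taken from the site's code.

```python
import math


def accuracy(probs, outcomes):
    """Share of matches where the favoured side (p >= 0.5) actually won."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


def brier(probs, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


def top_quartile_accuracy(probs, outcomes):
    """Accuracy on the 25% of matches with the most confident prediction."""
    ranked = sorted(zip(probs, outcomes), key=lambda t: abs(t[0] - 0.5), reverse=True)
    top = ranked[: max(1, len(ranked) // 4)]
    return accuracy([p for p, _ in top], [y for _, y in top])


# A model that always says 50% reproduces the coinflip baselines.
coin = [0.5] * 4
results = [1, 0, 1, 0]
print(brier(coin, results))               # 0.25
print(round(log_loss(coin, results), 3))  # 0.693
```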
Calibration · predicted vs actual win rate
predicted · actual · matches
50–60% · 48.3% · 839
60–70% · 56.5% · 216
70–80% · 73.6% · 53
80–90% · 77.8% · 9
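A calibration breakdown like this one can be produced by bucketing predictions into 10-point bins and comparing each bin with the observed win rate. The sketch below assumes predictions are folded onto the favourite (a 30% prediction becomes 70% for the other side), which would explain why the bins start at 50%. That folding step and the `calibration_table` name are assumptions.

```python
from collections import defaultdict


def calibration_table(probs, outcomes):
    """Return {bin_lower_edge_pct: (actual_win_rate, n)} in 10-point bins."""
    wins, counts = defaultdict(int), defaultdict(int)
    for p, y in zip(probs, outcomes):
        if p < 0.5:                      # fold onto the favourite
            p, y = 1.0 - p, 1 - y
        pct = round(p * 100, 6)          # guard against float noise such as 56.999999
        lower = min(int(pct // 10) * 10, 90)  # p == 1.0 stays in the top bin
        counts[lower] += 1
        wins[lower] += y
    return {b: (wins[b] / counts[b], counts[b]) for b in sorted(counts)}


table = calibration_table([0.55, 0.58, 0.65, 0.30], [1, 0, 1, 0])
print(table)  # {50: (0.5, 2), 60: (1.0, 1), 70: (1.0, 1)}
```

A well-calibrated model has actual win rates that fall inside each bin's range. Bins with very few matches are noisy and should be read with caution.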
H2H Elo + Radiant bias
elo_h2h_side · Classic Elo with a +25 Elo nudge for Radiant (observed historical edge).
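To give a feel for the size of that nudge, the sketch below adds the 25 points to the Radiant side before applying the usual Elo curve. Applying the bonus at prediction time only is an assumption, and the function name is hypothetical.

```python
RADIANT_BONUS = 25.0  # Elo points, taken from the model description


def radiant_win_prob(radiant_rating, dire_rating, bonus=RADIANT_BONUS):
    """Elo win probability for Radiant with a flat side bonus added."""
    diff = (radiant_rating + bonus) - dire_rating
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


print(round(radiant_win_prob(1500, 1500), 3))  # 0.536
```

Between two equally rated teams, the bonus shifts the forecast from 50% to about 53.6% in Radiant's favour.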
Accuracy
0.528
50% = coinflip
Brier score
0.2489
0.25 = coinflip · lower = better
Log-loss
0.6909
0.693 = coinflip · lower = better
Top-quartile acc
0.620
highest-confidence 25%
Calibration · predicted vs actual win rate
predicted · actual · matches
50–60% · 49.8% · 797
60–70% · 58.1% · 253
70–80% · 67.9% · 56
80–90% · 72.7% · 11
Patch-aware Elo (60d half-life)
elo_recency_60d · Stale ratings drift back to 1500 with a 60-day half-life. Penalizes teams that haven't played recently, as a rough stand-in for meta shifts.
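One way to read "drift back to 1500 with a 60-day half-life" is that the gap between a rating and 1500 halves for every 60 days of inactivity. The sketch below implements that reading. The function name and the choice to apply decay per idle period are assumptions.

```python
def decay_rating(rating, days_idle, half_life_days=60.0, baseline=1500.0):
    """Pull a rating toward the baseline; the gap halves every half_life_days."""
    factor = 0.5 ** (days_idle / half_life_days)
    return baseline + (rating - baseline) * factor


print(decay_rating(1700, 60))                       # 1600.0
print(decay_rating(1700, 120))                      # 1550.0
print(decay_rating(1700, 120, half_life_days=120))  # 1600.0
```

The last line shows the effect of a longer half-life: after the same 120 idle days, a 120-day half-life keeps twice as much of the rating gap as the 60-day one.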
Accuracy
0.527
50% = coinflip
Brier score
0.2484
0.25 = coinflip · lower = better
Log-loss
0.6897
0.693 = coinflip · lower = better
Top-quartile acc
0.631
highest-confidence 25%
Calibration · predicted vs actual win rate
predicted · actual · matches
50–60% · 49.4% · 815
60–70% · 60.2% · 246
70–80% · 68.0% · 50
80–90% · 66.7% · 6
Patch-aware Elo (120d half-life)
elo_recency_120d · Slower decay: keeps ratings stickier than the 60-day variant. Useful when patch cadence is slower.
Accuracy
0.526
50% = coinflip
Brier score
0.2486
0.25 = coinflip · lower = better
Log-loss
0.6902
0.693 = coinflip · lower = better
Top-quartile acc
0.624
highest-confidence 25%
Calibration · predicted vs actual win rate
predicted · actual · matches
50–60% · 49.4% · 811
60–70% · 59.3% · 246
70–80% · 68.0% · 50
80–90% · 70.0% · 10
Roadmap · what's not built yet
Patch-aware Elo
planned
Current Elo treats a 6-month-old match the same as one from last week. Patches shift the meta significantly, so adding patch-decay weighting is expected to lift accuracy by roughly 2–3 percentage points.
Roster-tracked Elo
planned
Team rating carries through roster changes. When a star player transfers, the new team should partially inherit the rating.
Draft XGBoost (live)
research
Pick/ban order carries meaningful signal, but drafts are only known after the game starts, so this would be a live-only model for in-progress matches.