Overview
Championship predictions use the same core methodology as race-picks but with key differences:
- Probability only - No points predictions are made for championships
- Config-based athletes - Athletes are manually configured rather than scraped from startlists
- Championship filter - Races marked with
Championship == 1in the schedule are processed
The prediction pipeline is: Config Athletes → Elo Matching → Probability Training → Normalization → Output
Alpine
Individual
Data Gathering
Championship athletes are configured in config.py by nation rather than scraped from FIS. Each nation has a list of qualified athletes for World Championships or Olympics. The race schedule is read from weekends.csv and filtered for races where Championship == 1.
Athletes are matched with their most recent Elo scores for Overall, Downhill, Super-G, Giant Slalom, Slalom, Speed, and Technical disciplines.
Probability
Training
Setup
Historical race data is preprocessed to create binary classification outcomes for each position threshold. The training data is filtered to the last 10 seasons and to skiers with an Elo percentage of 75% or greater.
Position thresholds are Win (1), Podium (3), Top 5, Top 10, and Top 30. Each threshold creates a separate binary classification problem.
Elo scores are converted to percentages of the maximum for each race. Weighted previous points use the last 5 discipline-specific results with linear weights (most recent = 5, oldest = 1).
Feature Selection
Feature selection uses BIC optimization via regsubsets for each position threshold independently. Speed events (Downhill, Super-G) use speed-related Elo columns. Technical events (Slalom, Giant Slalom) use technical-related Elo columns.
Modeling
Binomial GAM models with REML estimation are trained for each position threshold. Brier scores validate probabilistic accuracy. Fallback models use simplified formulas if the full model fails.
Adjustments
Period adjustments are calculated during training based on performance differences across World Cup stages. Adjustments require statistical significance (p < 0.05) and minimum sample sizes.
Testing
Startlist Setup
Athletes from the config are matched with current Elo scores. Elo percentages are calculated by dividing by the startlist maximum. Missing values are imputed with first quartile values.
Race participation probabilities use exponential decay (α = 0.1) based on recent race participation. Configured championship athletes receive 1.0 probability.
Modeling
The trained binomial GAM models are applied to generate probability predictions for each position threshold.
Adjustments
Period adjustments from training are applied to predictions where statistically significant.
Normalization and Monotonic Constraints
A 5-phase normalization process is applied using two-phase constrained normalization:
-
Phase 1 - Two-Phase Constrained Normalization:
- Phase A: Scale all probabilities proportionally to target sum (100% for win, 300% for podium, etc.)
- Phase B: Cap athletes above 100% and redistribute excess iteratively
- Process iterates until no athletes exceed cap
-
Phase 2 - Monotonic Constraints: Ensures Win ≤ Podium ≤ Top 5 ≤ Top 10 ≤ Top 30
-
Phase 3 - Re-normalize (Two-Phase Constrained):
- Same two-phase approach as Phase 1
- Adjusts for any sum changes from monotonic constraints
-
Phase 4 - Final Participation Cap: No probability can exceed athlete’s participation probability
-
Phase 5 - Final Monotonic Enforcement: Second pass to ensure no inversions from re-normalization
Why Two-Phase Normalization?
The key insight is that raw model predictions may be over-inflated (e.g., multiple athletes at 120% win probability). The two-phase approach handles this correctly:
- Phase A scales everyone proportionally first, bringing the total to the target
- Phase B then caps any athletes still above 100% and redistributes excess
This ensures:
- Over-predicted athletes are scaled down fairly before capping
- Truly dominant athletes (above 100% after scaling) get capped and excess redistributed
- Probabilities always sum to the correct target
Mathematical Guarantee: After Phase A scaling, at most target/100 athletes can exceed 100%. For Win (target=100%), at most 1 athlete can exceed 100%. For Podium (target=300%), at most 3 can exceed 100%. This guarantees Phase B always has room to redistribute excess.
Biathlon
Individual
Data Gathering
Championship athletes are configured in config.py by nation. Race schedule filtered from weekends.csv for Championship == 1. Athletes matched with Elo scores for Overall, Sprint, Pursuit, Individual, and Mass Start.
Probability
Training
Training data filtered to last 10 seasons and 75% Elo threshold. Position thresholds: Win, Podium, Top 5, Top 10, Top 30. BIC feature selection per threshold. Binomial GAM models with REML. Period and elevation (≥1300m) adjustments calculated.
Testing
Elo percentages from startlist maximum. Exponential decay (α = 0.1) for participation probability. Trained models applied with adjustments.
Normalization and Monotonic Constraints
Same 5-phase process as Alpine.
Relay
Data Gathering
Relay teams configured by nation in config.py with team member lists. Team Elo scores calculated as averages of individual members.
Probability
Training
Position thresholds reduced to Win, Podium, Top 5, Top 10 for relay events. BIC feature selection on team average Elos. Binomial GAM models trained per threshold.
Testing
Team average Elos calculated from configured rosters. Models applied to generate team probability predictions.
Normalization and Monotonic Constraints
Same 5-phase process with relay-specific thresholds.
Cross-Country
Cross-country skiing predictions use a Monte Carlo simulation approach that differs from other sports. This approach naturally handles varying field sizes without normalization artifacts and produces more calibrated probability distributions.
Individual
Data Gathering
Championship athletes configured by nation. Race schedule filtered for Championship == 1. Athletes matched with Elo scores for Overall, Distance, Distance Classic, Distance Freestyle, Sprint, Sprint Classic, Sprint Freestyle, Classic, and Freestyle.
Probability
Training
Training data filtered to last 10 seasons.
Position thresholds: Win, Podium, Top 5, Top 10, Top 30 for all individual events.
Feature Selection uses BIC optimization via regsubsets with a positive coefficient constraint. Features that receive negative coefficients in the model are excluded, as all features should be positive indicators of performance. This prevents counterintuitive situations where higher Elo predicts lower probability.
Exponential Decay Weighting: Historical races are weighted using exponential decay based on date:
- Formula:
weight = exp(-DECAY_LAMBDA * days_ago) - DECAY_LAMBDA = 0.002 gives approximately 50% weight after 1 year
- More recent races have greater influence on predictions
Testing
For each athlete, a performance distribution is built by:
- Historical Performance: Recent discipline-specific race results weighted by exponential decay
- GAM Prediction: Expected points from the trained model
- Variance Estimation: Standard deviation based on historical variability, scaled and bounded
Monte Carlo Simulation: 10,000 race simulations are run where each athlete samples from their distribution. Position is determined by ranking all samples. Position probabilities are the proportion of simulations where each athlete achieved each threshold.
Variance Control Parameters (calibrated via Brier score grid search):
- SD_SCALE_FACTOR: Scales athlete variance (lower = favorites win more)
- SD_MIN/SD_MAX: Bounds on variance to prevent extreme values
Normalization
Simulation naturally produces correctly-summed probabilities (exactly 1 winner, 3 podium spots, etc.). No post-hoc normalization is required, avoiding artifacts from iterative capping/redistribution.
Relay
Data Gathering
Relay teams configured by nation with leg assignments. Classic legs (1-2) and Freestyle legs (3-4) use appropriate technique-specific Elos.
Team Selection
Dual-Optimized Selection: Teams are selected to maximize both podium probability (threshold = 3) and win probability (threshold = 1) using leg-specific models. The selection algorithm:
- Trains leg-specific binomial GAMs for both P(team podium | athlete on leg X) and P(team win | athlete on leg X)
- Uses combinatorial optimization to find optimal 4-athlete lineups for each objective
- Considers athlete availability and leg-appropriate technique
- Outputs both podium-optimized and win-optimized team rosters
Probability
Hybrid Approach (GAM + Simulation)
Cross-country relay uses a hybrid approach combining production models with simulation:
- Leg-Specific GAM Models: Binomial GAMs are trained for each leg position predicting P(team podium | athlete on this leg) and P(team win | athlete on this leg)
- Leg Importance Calculation: Model deviance explained determines relative importance of each leg position
- Team Distribution: Athlete predictions are combined with importance weights to create team score distributions
- Monte Carlo Simulation: Teams are ranked across 10,000 simulations
Exponential Decay for prev_points_weighted: Relay uses the same exponential decay weighting as individual races:
- Legs 1-2 (Classic): Uses classic distance race history with exponential decay
- Legs 3-4 (Freestyle): Uses freestyle distance race history with exponential decay
- Formula:
weight = exp(-DECAY_LAMBDA * days_ago)where DECAY_LAMBDA = 0.002
Leg Features (matching technique requirements):
- Legs 1-2 (Classic):
prev_points_weighted, Pelo_pct, Distance_Pelo_pct, Distance_C_Pelo_pct, Classic_Pelo_pct, Sprint_Pelo_pct, Sprint_C_Pelo_pct - Legs 3-4 (Freestyle):
prev_points_weighted, Pelo_pct, Distance_Pelo_pct, Distance_F_Pelo_pct, Freestyle_Pelo_pct, Sprint_Pelo_pct, Sprint_F_Pelo_pct
Variance Control
Relay-specific variance parameters (calibrated separately from individual races):
- RELAY_SCORE_SD_MIN/MAX: Bounds on team score variance
Output
Team probabilities for Win, Podium, Top 5, Top 10 from simulation. Leg-level probabilities are derived by scaling the leg model predictions by team-level threshold ratios.
Team Sprint
Data Gathering
Team sprint pairs configured by nation. Each team has two athletes alternating legs.
Technique-Specific Models
Unlike 4-leg relay, team sprint uses technique-specific features and models:
- Classic Team Sprint (C): Uses
Sprint_C_Pelo_pct, Classic_Pelo_pct, Distance_C_Pelo_pct - Freestyle Team Sprint (F): Uses
Sprint_F_Pelo_pct, Freestyle_Pelo_pct, Distance_F_Pelo_pct
Separate models are trained for each technique at the championship.
Exponential Decay for prev_points_weighted: Team sprint uses technique-specific sprint history with exponential decay:
- Classic (C): Uses classic sprint race history
- Freestyle (F): Uses freestyle sprint race history
- Formula:
weight = exp(-DECAY_LAMBDA * days_ago)where DECAY_LAMBDA = 0.002
Team Selection
Same dual-optimized approach as relay, producing both podium-optimized and win-optimized team rosters.
Probability
Same hybrid approach as relay with 2 legs instead of 4. Variance parameters (TS_SCORE_SD_MIN/MAX) are calibrated separately.
Calibration System
Three independent calibration processes using Brier score on historical data (2018+):
- Individual Race Calibration: Grid search over DECAY_LAMBDA, SD_SCALE_FACTOR, SD_MIN, SD_MAX
- Relay Calibration: Grid search over RELAY_SCORE_SD_MIN, RELAY_SCORE_SD_MAX
- Team Sprint Calibration: Grid search over TS_SCORE_SD_MIN, TS_SCORE_SD_MAX
Nordic Combined
Individual
Data Gathering
Championship athletes configured by nation. Athletes matched with Elo scores for Overall, Individual, Individual Compact, Sprint, and Mass Start.
Probability
Training
Training data filtered to last 10 seasons and 75% Elo threshold. Position thresholds: Win, Podium, Top 5, Top 10, Top 30. BIC feature selection per threshold. Binomial GAM models with REML. Period and elevation adjustments calculated.
Testing
Elo percentages from startlist maximum. Exponential decay (α = 0.1) for participation. Models applied with adjustments.
Normalization and Monotonic Constraints
Same 5-phase process as Alpine.
Team
Data Gathering
Team members configured by nation. Team Elo scores calculated as member averages.
Probability
Training
Position thresholds: Win, Podium, Top 5, Top 10. BIC feature selection on team average Elos. Binomial GAM models per threshold.
Testing
Team average Elos from configured rosters. Models applied for team predictions.
Normalization and Monotonic Constraints
Same 5-phase process with team thresholds.
Ski Jumping
Individual
Data Gathering
Championship athletes configured by nation. Athletes matched with Elo scores for Overall, Small Hill, Medium Hill, Normal Hill, Large Hill, and Flying.
Probability
Training
Training data filtered to last 10 seasons and 75% Elo threshold. Position thresholds: Win, Podium, Top 5, Top 10, Top 30. BIC feature selection per threshold using hill-size appropriate Elos. Binomial GAM models with REML. Period and hill size adjustments calculated.
Testing
Elo percentages from startlist maximum. Exponential decay (α = 0.1) for participation. Models applied with adjustments.
Normalization and Monotonic Constraints
Same 5-phase process as Alpine.
Team
Data Gathering
Team members (typically 4 jumpers) configured by nation. Team Elo scores calculated as member averages.
Probability
Training
Position thresholds: Win, Podium, Top 5 (smaller team fields). BIC feature selection on team average Elos. Binomial GAM models per threshold.
Testing
Team average Elos from configured rosters. Models applied for team predictions.
Normalization and Monotonic Constraints
Same 5-phase process with team-specific thresholds.
Mixed Team
Data Gathering
Mixed team members (2 men, 2 women) configured by nation. Team Elo scores calculated as averages across both genders.
Probability
Same methodology as team events with mixed team thresholds.