State Table
68,978 rows
The first-pass live model is trained and tested on minute-level match states rather than final-match summaries.
Tactical Decision Support
A minute-level live forecasting case study that updates win, draw, and loss probabilities from scoreline, time remaining, red-card state, cumulative xG, and pre-match strength priors.
How should outcome probabilities update during a match, and how can those updates support tactical decision-making?
Best Log Loss
1.1095
The GAM-based remaining-goals model currently produces the strongest live outcome probabilities on the 2020/2021 test season.
Best RPS
0.2198
The same model also leads on ranked probability score, beating both the empirical baseline and the simpler Poisson alternative.
Estimate in-game win, draw, and loss probabilities as the state of a match evolves.
Treat this as a remaining-goals problem from the current state to full time, because that creates a direct bridge between football events and live outcome probabilities.
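The bridge from remaining goals to outcome probabilities can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each side's remaining goals follow independent Poisson distributions, and the names `outcome_probs`, `lam_home`, `lam_away`, and the rate of 1.3 goals per 90 minutes are hypothetical.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def outcome_probs(score_diff: int, lam_home: float, lam_away: float,
                  max_goals: int = 15):
    """Return (home_win, draw, away_win) probabilities at full time.

    score_diff: current home goals minus away goals.
    lam_home, lam_away: expected remaining goals for each side
    (assumption: independent Poisson counts, truncated at max_goals).
    """
    home = [poisson_pmf(k, lam_home) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, lam_away) for k in range(max_goals + 1)]
    win = draw = loss = 0.0
    for h, p_h in enumerate(home):
        for a, p_a in enumerate(away):
            final_diff = score_diff + h - a
            p = p_h * p_a
            if final_diff > 0:
                win += p
            elif final_diff == 0:
                draw += p
            else:
                loss += p
    total = win + draw + loss  # renormalise after truncation
    return win / total, draw / total, loss / total

# Hypothetical usage: a one-goal home lead with 75 and then 15 minutes left,
# using an illustrative constant rate of 1.3 goals per 90 minutes per side.
for minutes_left in (75, 15):
    lam = 1.3 * minutes_left / 90
    print(minutes_left, outcome_probs(1, lam, lam))
```

In a fitted model the two rates would come from the state features (time remaining, red cards, xG, prior) rather than from a constant, but the final conversion step to win, draw, and loss probabilities would look broadly like this.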
This directly supports the wider football problem of in-game strategy under changing conditions.
The project is designed to answer questions such as how much a red card shifted the match, how much the pre-match prior still matters at minute 60, and when a game state remains too uncertain for strong tactical conclusions.
The first pass builds a minute-level state table from the broader La Liga archive used in Project 2.
The resulting state table contains 68,978 rows: 62,790 train, 3,003 validation, and 3,185 test.
Each row tracks minute, time remaining, score state, cumulative xG, red-card counts, and a pre-match strength-gap prior from Project 2.
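One way to picture a single row of the state table is as a small record type. The field names below are assumptions made for illustration (the real column names are not given here), and the sketch ignores stoppage time by treating every match as 90 minutes.

```python
from dataclasses import dataclass

REGULATION_MINUTES = 90  # assumption: stoppage time is ignored in this sketch

@dataclass(frozen=True)
class MatchState:
    """One minute-level row; all field names are hypothetical."""
    match_id: str
    minute: int
    home_goals: int
    away_goals: int
    home_cum_xg: float
    away_cum_xg: float
    home_red_cards: int
    away_red_cards: int
    strength_gap_prior: float  # pre-match prior; home-minus-away sign assumed

    @property
    def time_remaining(self) -> int:
        """Minutes left in regulation time, never negative."""
        return max(REGULATION_MINUTES - self.minute, 0)

    @property
    def score_diff(self) -> int:
        """Home goals minus away goals at this minute."""
        return self.home_goals - self.away_goals

# Illustrative values only, not taken from the dataset.
state = MatchState("example-match", 60, 1, 0, 1.12, 0.48, 0, 1, 0.35)
print(state.time_remaining, state.score_diff)
```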
The model ladder is deliberately interpretable. A naive empirical state baseline sets the floor, a remaining-goals Poisson model provides a structured statistical step up, and a GAM-based remaining-goals model captures the nonlinear effect of time remaining and state intensity.
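The bottom rung of that ladder can be sketched as a lookup of observed outcome frequencies per state. This is an illustrative version only: the state key (a minute bucket plus score difference), the outcome labels, and the additive smoothing term `alpha` are assumptions, and the project's actual baseline may define states differently or omit smoothing.

```python
from collections import Counter, defaultdict

OUTCOMES = ("H", "D", "A")  # home win, draw, away win

def fit_empirical_baseline(rows, alpha=1.0):
    """Fit outcome frequencies per (minute_bucket, score_diff) state.

    rows: iterable of (minute_bucket, score_diff, outcome) tuples.
    alpha: additive smoothing (an assumption of this sketch) so that
    unseen outcomes never receive a probability of exactly zero.
    Returns a predict(minute_bucket, score_diff) function.
    """
    counts = defaultdict(Counter)
    for bucket, diff, outcome in rows:
        counts[(bucket, diff)][outcome] += 1

    def predict(bucket, diff):
        c = counts.get((bucket, diff), Counter())
        total = sum(c.values()) + alpha * len(OUTCOMES)
        return tuple((c[o] + alpha) / total for o in OUTCOMES)

    return predict

# Toy training rows, invented for illustration.
predict = fit_empirical_baseline([
    (4, 1, "H"), (4, 1, "H"), (4, 1, "D"), (0, 0, "A"),
])
print(predict(4, 1))
```

A baseline like this has no notion of xG, red cards, or prior strength, which is what makes it a useful floor for the structured models.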
The key decision here was not to jump straight into a black-box live classifier. Remaining-goals models make the football mechanism visible: given the current state, how much scoring is still expected for each side?
The first pass already shows that richer state structure materially improves the forecasts, with the GAM outperforming the simpler alternatives on the holdout season.
Validation is again temporal rather than random. The live models are trained on earlier seasons and judged on 2020/2021 minute-level match states.
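A season-based split of this kind can be sketched as below. The row layout and the `"season"` key are hypothetical, and the validation season shown is an assumption; only the 2020/2021 test season is stated above.

```python
def temporal_split(rows, val_season, test_season):
    """Split state rows by season so no match straddles two splits.

    Relies on season labels such as "2019/2020" sorting chronologically
    as plain strings. Rows from seasons after val_season that are not
    the test season are dropped.
    """
    train, val, test = [], [], []
    for row in rows:
        season = row["season"]
        if season == test_season:
            test.append(row)
        elif season == val_season:
            val.append(row)
        elif season < val_season:
            train.append(row)  # strictly earlier seasons only
    return train, val, test

# Toy rows, invented for illustration.
rows = [
    {"season": "2018/2019", "match_id": "a", "minute": 10},
    {"season": "2019/2020", "match_id": "b", "minute": 10},
    {"season": "2020/2021", "match_id": "c", "minute": 10},
]
train, val, test = temporal_split(
    rows, val_season="2019/2020", test_season="2020/2021"
)
```

Splitting on season rather than on individual rows matters for minute-level data, because consecutive minutes of the same match are highly correlated and a random row split would leak them across train and test.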
The evaluation focus is forecast quality at the state level, not just whether the eventual match outcome was guessed correctly from a late game state.
The first-pass live model is already viable: both structured models beat the empirical baseline by a wide margin, and the GAM-based remaining-goals model currently performs best.
Current test metrics (lower is better on both):
Empirical baseline: log loss 4.7749, ranked probability score 0.3474
Remaining-goals Poisson: log loss 1.1553, ranked probability score 0.2363
Remaining-goals GAM: log loss 1.1095, ranked probability score 0.2198
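For reference, both metrics can be computed per state row as in the sketch below. It uses the standard definitions for a single three-way forecast ordered as (home win, draw, away win); the clipping constant `eps` is an assumption of this sketch, and the project's own evaluation code may aggregate or clip differently.

```python
import math

def log_loss(probs, outcome_idx, eps=1e-15):
    """Negative log of the probability assigned to the observed outcome.

    eps clips zero probabilities so the loss stays finite (an assumption).
    """
    return -math.log(max(probs[outcome_idx], eps))

def rps(probs, outcome_idx):
    """Ranked probability score for one ordered categorical forecast.

    Compares cumulative forecast and cumulative observed distributions,
    so probability placed on a 'nearby' outcome is penalised less.
    """
    cum_p = cum_o = total = 0.0
    for i in range(len(probs) - 1):
        cum_p += probs[i]
        cum_o += 1.0 if i == outcome_idx else 0.0
        total += (cum_p - cum_o) ** 2
    return total / (len(probs) - 1)

# Illustrative forecast for a match that ended in a home win (index 0).
forecast = (0.6, 0.25, 0.15)
print(log_loss(forecast, 0), rps(forecast, 0))
```

Averaging these per-row values over all test states gives a state-level score of the kind reported above.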
That is a strong early result because it shows the project is doing more than restating scoreline frequencies. It is learning a better live probability surface from state, xG, red cards, and prior strength.
The main football lesson is intuitive but now quantified: score difference matters more and more as time runs down, while pre-match strength matters most early or when the game is still level.

Both structured live models massively outperform the empirical baseline, and the GAM-based remaining-goals model currently leads on all three first-pass test metrics.

The timeline view turns the model into something tactical and interpretable: probabilities move with score, time, red cards, and the pre-match prior rather than staying static.

This first-pass calibration view shows whether the live home-win probabilities are directionally aligned with observed outcomes rather than simply being sharp or dramatic.
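A calibration check of this kind can be sketched by binning the predicted home-win probabilities and comparing each bin's mean prediction with the observed home-win rate. The function name, the ten equal-width bins, and the toy inputs are assumptions for illustration.

```python
def calibration_table(probs, outcomes, n_bins=10):
    """Bin predicted probabilities and compare them with observed rates.

    probs: predicted home-win probabilities in [0, 1].
    outcomes: 1 if the home side won the match, else 0.
    Returns (mean_predicted, observed_rate, count) per non-empty bin.
    """
    counts = [0] * n_bins
    prob_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in last bin
        counts[idx] += 1
        prob_sums[idx] += p
        hit_sums[idx] += y
    return [
        (prob_sums[i] / counts[i], hit_sums[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]

# Toy inputs, invented for illustration.
table = calibration_table([0.05, 0.05, 0.95, 0.95], [0, 0, 1, 1])
for mean_pred, observed, n in table:
    print(f"{mean_pred:.2f} predicted, {observed:.2f} observed, n={n}")
```

A well-calibrated model has mean predictions close to observed rates in every bin with enough rows to be meaningful.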

This is the clearest football takeaway in the project: the same one-goal lead means something very different at minute 15 than it does at minute 75, and the model captures that directly.

Pre-match strength still matters while the game is level. The stronger team starts with a real edge, and that prior continues to shape the live probabilities until enough in-match evidence arrives.

Late red-card imbalances create the most dramatic shifts in this dataset. When the home team is down a player late, its win probability nearly disappears; when the away team is down a player late, the home side is almost certain to win.
The first implementation of the state builder failed on empty event subsets, because matches without red cards produced empty tables with no typed minute column. That broke the minute-by-minute state extraction logic until the pipeline was made robust to empty event classes.
The project also forced an early modelling choice. The remaining-goals approach proved a more defensible first version than a direct live classification model, because it stayed interpretable and fit naturally with football match mechanics.
Those failures improved the project because they pushed the workflow toward robustness in the data layer and discipline in the modelling layer.
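The empty-event problem can be illustrated with a small sketch in plain Python. It is not the project's state builder: the function name and inputs are hypothetical, and in a DataFrame-based pipeline the analogous fix would be to give the empty table an explicitly typed minute column. The point is that a match with no red cards should yield a valid all-zero series rather than an error.

```python
def red_cards_by_minute(red_card_minutes, max_minute=90):
    """Cumulative red-card count at each minute from 0 to max_minute.

    red_card_minutes: minutes at which one side received a red card.
    An empty list (or None) is a normal case and yields all zeros.
    """
    minutes = sorted(int(m) for m in (red_card_minutes or []))
    counts = []
    seen = 0
    for minute in range(max_minute + 1):
        # Advance past every card shown at or before this minute.
        while seen < len(minutes) and minutes[seen] <= minute:
            seen += 1
        counts.append(seen)
    return counts

no_cards = red_cards_by_minute([])      # match without red cards
one_card = red_cards_by_minute([60])    # single red card at minute 60
```

Handling the empty case inside the builder keeps every match on the same code path, so downstream joins never need a special branch for card-free matches.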
The model should support tactical reasoning, not automate it. Analysts still need opponent context, player fitness information, and coaching intent.
The right use is to describe how the game state changed, how much prior expectations still matter, and where uncertainty is still too large to justify a strong tactical reaction.
What it can say now is when the match has become materially more stable or unstable. What it cannot yet say cleanly is whether a substitution at minute 58 is better than one at minute 68, because substitutions are not yet modelled as causal interventions.
The implementation separates state-table construction, pre-match prior ingestion, remaining-goals modelling, and evaluation into modular pieces.
That separation matters because this is the first project in the series where data volume becomes meaningfully larger: minute-level states are much more demanding to build and process than match-level or shot-level summaries.
The first-pass scoreline path is approximated from cumulative xG progression rather than exact goal timestamps, so the state table is useful but not yet a perfect reconstruction of live match history.
Substitution quality and tactical shape changes are still only partially observed in the standard event feed, which limits how confidently their effects can be estimated.
Red-card states are relatively sparse, so the live model should be interpreted carefully in extreme manpower situations.
That is why this version can describe red-card impact more confidently than substitution timing: the substitution effect is not yet modelled directly.
The next upgrade is to replace the approximate score-state reconstruction with true event-time score tracking from goal events.
A second upgrade is to export live probability timelines and tactical scenario charts, which would make the case study much more visually concrete.
Later iterations can add substitutions and more explicit tactical scenario analysis once the live state backbone is fully trustworthy.