Recruitment and Development
A role-aware recruitment model built from lineup-derived minutes, event-level player contributions, and team-context adjustment, designed to separate player signal from strong-team inflation.
How should a club compare recruitment targets within role while controlling for team environment and small-sample noise?
Rated Player-Seasons
288
The first pass rates 288 La Liga player-seasons from 2015/2016 to 2020/2021 across the four target roles.
Strongest Stability Signal
0.50
Central midfielder ratings correlate at 0.50 from one season to the next, based on 15 repeated player-seasons.
Largest Context Gap
1.32
Luis Suarez's 2015/2016 raw center-forward score dropped by 1.32 after adjusting for Barcelona's team environment.
Build a player rating that is useful for recruitment and development decisions, rather than a generic public-facing score.
The core challenge is separating player contribution from role, team environment, and unreliable small-sample output.
Recruitment models fail when they compare players across fundamentally different tactical jobs or reward outputs inflated by dominant teams.
A club analyst needs a shortlist tool that narrows discussion safely before live scouting, video review, medical checks, and financial screening.
The first pass uses StatsBomb Open Data for La Liga from 2015/2016 to 2020/2021. Player minutes are derived from lineup stints, not assumed from appearance counts.
Event files provide shots, xG, passing, carrying, dribbling, and defensive actions. Project 2 contributes season-level team non-penalty xG context for adjustment.
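As an illustration of why lineup-derived minutes matter, the sketch below sums stint lengths per player and compares the result with the appearances-times-90 shortcut. The `stints` table, its column names, and its values are hypothetical and do not reflect the StatsBomb schema or the project's actual tables.

```r
# Hypothetical lineup stints: one row per continuous spell on the pitch.
# Column names and values are illustrative only.
stints <- data.frame(
  player   = c("A", "A", "B", "B", "C"),
  match_id = c(1, 2, 1, 2, 2),
  from_min = c(0, 0, 0, 60, 0),
  to_min   = c(90, 60, 90, 90, 90)
)

# Stint-based minutes: sum the length of every spell per player.
stints$stint_min <- stints$to_min - stints$from_min
stint_minutes <- tapply(stints$stint_min, stints$player, sum)

# The shortcut being replaced: distinct appearances multiplied by 90.
appearances <- tapply(stints$match_id, stints$player,
                      function(m) length(unique(m)))
naive_minutes <- appearances * 90

comparison <- data.frame(
  player        = names(stint_minutes),
  stint_minutes = as.numeric(stint_minutes),
  naive_minutes = as.numeric(naive_minutes[names(stint_minutes)])
)
print(comparison)
```

In this toy data, player A was substituted off after 60 minutes in one match and player B came on at 60 minutes in another, so the shortcut credits both with 180 minutes while the stints give 150 and 120. Any per-90 rate built on the shortcut would be understated for those players.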
The model standardises features within role, builds raw attacking, creation, and defensive component scores, and combines them into a transparent raw overall rating.
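A minimal sketch of within-role standardisation, assuming one illustrative feature per component and equal component weights. The feature names, the role labels, and the weighting are assumptions made for the example; the source does not state which features or weights the model uses.

```r
# Hypothetical per-90 features for six player-seasons in two roles.
players <- data.frame(
  player        = c("A", "B", "C", "D", "E", "F"),
  role          = c("CB", "CB", "CB", "CM", "CM", "CM"),
  npxg_90       = c(0.02, 0.05, 0.08, 0.10, 0.20, 0.30),
  key_pass_90   = c(0.2, 0.4, 0.6, 1.0, 1.5, 2.0),
  def_action_90 = c(9, 11, 13, 6, 7, 8)
)

# z-score a feature within each role, so players are only ever
# compared with others doing the same tactical job.
z_within <- function(x, g) {
  ave(x, g, FUN = function(v) (v - mean(v)) / sd(v))
}

players$attack_z   <- z_within(players$npxg_90, players$role)
players$creation_z <- z_within(players$key_pass_90, players$role)
players$defence_z  <- z_within(players$def_action_90, players$role)

# Assumed equal weights across the three components.
players$raw_overall <- rowMeans(
  players[, c("attack_z", "creation_z", "defence_z")]
)
print(players[, c("player", "role", "raw_overall")])
```

The center back with 13 defensive actions per 90 and the central midfielder with 8 both land one standard deviation above their own role mean, which is the point of standardising within role rather than across the whole player pool.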
A role-specific GAM then estimates how much of that raw score is explained by team non-penalty xG difference. The final rating is the context-adjusted score, shrunk for low-minute reliability.
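The adjustment and shrinkage steps can be sketched as follows for a single role, using simulated data. The formula, the basis size `k = 5`, the shrinkage form `minutes / (minutes + k_minutes)`, and the constant `k_minutes = 900` are all assumptions for illustration; the source only says that a role-specific GAM is fitted on team non-penalty xG difference and that low-minute ratings are shrunk.

```r
library(mgcv)

set.seed(42)
n <- 80

# Simulated single-role sample: raw scores partly driven by team strength.
role_df <- data.frame(
  team_npxg_diff = runif(n, -0.8, 1.2),
  minutes        = round(runif(n, 450, 3200))
)
role_df$raw_score <- 0.8 * role_df$team_npxg_diff + rnorm(n, sd = 0.5)

# Smooth of raw score on team non-penalty xG difference.
fit <- gam(raw_score ~ s(team_npxg_diff, k = 5),
           data = role_df, method = "REML")

# The smooth term (intercept excluded) is the part of the raw score
# attributed to team environment; this is the "context gap".
role_df$context_effect <- as.numeric(predict(fit, type = "terms")[, 1])
role_df$adjusted <- role_df$raw_score - role_df$context_effect

# Assumed reliability weight: approaches 1 as minutes grow.
k_minutes <- 900
role_df$reliability <- role_df$minutes / (role_df$minutes + k_minutes)

# Shrink the adjusted score toward the role mean for low-minute players.
role_mean <- mean(role_df$adjusted)
role_df$final <- role_mean +
  role_df$reliability * (role_df$adjusted - role_mean)

cor(role_df$raw_score, role_df$team_npxg_diff)  # inflated by team strength
cor(role_df$adjusted, role_df$team_npxg_diff)   # close to zero after adjustment
```

Under this reading, a context gap such as the 1.32 reported for Luis Suarez corresponds to `context_effect`, the difference between the raw and adjusted scores.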
Validation focuses on season-to-season stability for repeated players, within-role coherence, and the size of context adjustments for players from dominant teams.
The current stability signal is directionally useful rather than definitive: central midfielders and center backs both show roughly 0.5 repeat correlation, while wingers remain too sparse to overclaim.
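One way to compute a repeat correlation of this kind is to pair each player-season with the same player's next season in the same role and correlate the two ratings within role. The sketch below does that on made-up ratings; the `ratings` table and its values are hypothetical, and the pairing rule (consecutive seasons, same role) is an assumption about how repeated player-seasons are defined.

```r
# Hypothetical final ratings; season_start is the first year of the season.
ratings <- data.frame(
  player = c("A", "A", "A", "B", "B", "C", "C", "D", "E", "E", "F", "F"),
  role   = c(rep("CM", 8), rep("W", 4)),
  season_start = c(2015, 2016, 2017, 2015, 2016, 2018, 2019, 2016,
                   2017, 2018, 2019, 2020),
  rating = c(0.9, 0.6, 0.8, -0.2, 0.1, 0.3, -0.1, 0.5,
             0.4, 0.5, -0.3, -0.2)
)

# Shift each season back by one so a join on season_start lines up
# season t with the same player's season t + 1.
nxt <- ratings
nxt$season_start <- nxt$season_start - 1
names(nxt)[names(nxt) == "rating"] <- "rating_next"

pairs <- merge(ratings, nxt, by = c("player", "role", "season_start"))

# Repeat correlation and pair count per role.
stability <- do.call(rbind, lapply(split(pairs, pairs$role), function(d) {
  data.frame(
    role       = d$role[1],
    n_pairs    = nrow(d),
    repeat_cor = cor(d$rating, d$rating_next)
  )
}))
print(stability)
```

Reporting `n_pairs` next to the correlation is the important design choice. The toy winger group has only two pairs, and any two distinct points correlate at exactly plus or minus one, which is why a very high figure from a handful of pairs carries little evidence.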
The first pass produces role-specific tables that are already useful for recruitment framing. The output highlights strong same-role performers while exposing how much elite-team context can flatter raw box-score production.
Barcelona attackers produce the clearest adjustment examples: Lionel Messi and Luis Suarez still rate highly, but their raw numbers are materially reduced once team context is accounted for.

The first-pass recruitment model covers 288 player-seasons across four roles. Center backs and central midfielders currently have the deepest samples, which matters when interpreting stability.

Repeated-player stability is strongest in the deeper role samples. Winger stability appears very high, but only three repeated player-seasons exist, so that result should be treated cautiously.

This chart shows how far raw ratings moved after adjusting for team environment. Barcelona attackers remain elite, but the model makes their strong-team inflation explicit instead of hiding it.
The original scaffold estimated minutes as appearances multiplied by 90. That was fast, but not defensible, so it was replaced with lineup-derived stint minutes.
A recent-seasons-only build produced too few player-seasons to make the recruitment model credible. The scope was widened again to start from 2015/2016 and recover sample depth.
Flexible context adjustment also needed a fallback path for thin role samples where a smooth would overfit or fail altogether.
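A fallback of this kind might look like the sketch below: try the smooth when the role sample is deep enough, drop to a straight line when it is thin or the smooth errors, and skip adjustment when there is almost no data. The function name `fit_context`, the `min_n = 30` threshold, and the choice of a linear model as the fallback are assumptions for illustration, not the project's documented behaviour.

```r
library(mgcv)

# Hypothetical helper: returns the fitted model and which path was used.
fit_context <- function(d, min_n = 30) {
  # Assumed rule: only attempt the smooth with at least min_n rows.
  if (nrow(d) >= min_n) {
    fit <- tryCatch(
      gam(raw_score ~ s(team_npxg_diff, k = 5), data = d, method = "REML"),
      error = function(e) NULL
    )
    if (!is.null(fit)) return(list(model = fit, type = "gam"))
  }
  # Thin sample or failed smooth: fall back to a linear adjustment.
  if (nrow(d) >= 3) {
    fit <- lm(raw_score ~ team_npxg_diff, data = d)
    return(list(model = fit, type = "linear"))
  }
  # Too few rows to estimate anything: leave scores unadjusted.
  list(model = NULL, type = "none")
}

# Simulated role samples of different depth.
set.seed(7)
make_role <- function(n) {
  x <- runif(n, -1, 1)
  data.frame(team_npxg_diff = x, raw_score = 0.5 * x + rnorm(n, sd = 0.4))
}
deep <- make_role(60)
thin <- make_role(10)
tiny <- make_role(2)

c(deep = fit_context(deep)$type,
  thin = fit_context(thin)$type,
  tiny = fit_context(tiny)$type)
```

Returning the path taken alongside the model makes it possible to report, per role, whether a rating was adjusted with a smooth, a line, or not at all.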
The rating is designed to narrow a shortlist, structure same-role comparisons, and flag players whose outputs deserve more video review.
It should not be used as a single signing score. Analysts still need tactical fit, athletic profile, contract situation, injury history, and live scouting context before acting.
The pipeline keeps raw StatsBomb files immutable, constructs canonical player-season tables, and exports reproducible tables for role summaries, top players, and context-gap diagnostics.
The implementation follows the same R pattern as the rest of the portfolio (data.table, mgcv, and testthat), which keeps the research stack coherent rather than one-off.
Open-data coverage is uneven, role assignment is still broad, and team-context adjustment is built from one season-level strength proxy rather than a full player-level hierarchical model.
Uncertainty is largest for wingers with sparse repeated seasons, low-minute players near the threshold, and anyone whose role changed materially across seasons.
Add age curves, league-translation logic, and better possession/value features so the model becomes more useful for cross-context recruitment work.
A strong next public-facing case study would be a World Cup or transfer-shortlist module that uses this rating system rather than replacing it.