Runs Under Model — Methodology
How the GameDay Analytics runs-under model predicts the probability that a batter
scores zero runs in a game — the Under 0.5 runs prop. Live values (active version,
metrics, settings) are read directly from the model database; the surrounding narrative
describes the fixed pipeline and decision logic.
1. What the model predicts
For each projected starting hitter the model estimates a single number: P(zero runs) — the probability the player scores zero runs in the game. That is exactly the Under 0.5 runs side of the batter-runs prop.
It is a binary classification problem. The training target is
scored_zero ∈ {0, 1}: 1 means the
player did not score (the Under wins), 0 means they scored one
or more (the Under loses). The model's output is a single
calibrated probability in [0, 1], which every downstream
edge calculation compares against the book's implied probability.
2. Active model
Recent model versions
| Version | Trained at | Active | Brier | Log loss | AUC |
|---|---|---|---|---|---|
| v1.0-lgbm | 2026-06-27 21:22:18 | active | 0.2245 | 0.6436 | 0.5781 |
3. Feature groups
The probability blends five weighted groups of inputs, all computed from the current-season (2026) context at prediction time. Each group can be scaled independently via model settings.
on_base 6 features
The batter's own propensity to reach base and score — season and recent on-base rate, runs-scored rate, plate-appearance volume. The dominant driver of whether they score at all.
| Member features |
|---|
| player_season_obp |
| player_l7_obp |
| player_l15_obp |
| player_l15_reach_rate |
| player_season_k_pct |
| player_season_bb_pct |
lineup_context 3 features
Where the batter hits and the run-scoring context around them — batting slot, projected plate appearances, and the ability of the hitters behind them to drive them in.
| Member features |
|---|
| batting_slot |
| behind_obp |
| behind_slg |
pitcher_suppression 6 features
The opposing starter's ability to suppress baserunners and runs — strikeout rate, WHIP, ERA, and Statcast suppression signals.
| Member features |
|---|
| sp_obp_against |
| sp_k_pct |
| sp_runners_on_rate |
| sp_xwoba_against |
| sp_hardhit_pct_against |
| sp_barrel_pct_against |
environment 4 features
Park and game-environment factors: venue run environment, home/away, day/night, and park factors that shift offensive output.
| Member features |
|---|
| park_run_factor |
| is_home |
| is_night |
| temperature |
team_offense 2 features
The batter's team offensive strength and the opponent's run prevention — team runs per game, recent form, and opposing-staff context.
| Member features |
|---|
| team_runs_per_game |
| implied_team_total |
4. Calibration
The core learner is a LightGBM gradient-boosted classifier. Its raw scores are useful for ranking player-games but are not calibrated probabilities, so they are passed through an isotonic regression fit on a held-out slice of the data. Isotonic calibration is a monotonic, non-parametric mapping: it preserves the ordering of the raw scores (apart from ties it creates) while bending the curve so that a published 70% cashes about 70% of the time.
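The production calibrator is not shown on this page. As an illustration only, here is a minimal pure-Python sketch of isotonic calibration using the pool-adjacent-violators algorithm. The names `fit_isotonic` and `calibrate` are hypothetical, and the sketch returns a step function, whereas library implementations typically interpolate between fitted points.

```python
from bisect import bisect_left


def fit_isotonic(scores, outcomes):
    """Fit a non-decreasing step function to (raw score, 0/1 outcome) pairs.

    Returns (uppers, means): block i covers raw scores up to uppers[i]
    and maps them to the calibrated probability means[i].
    """
    # Pool tied scores first so each distinct score is one weighted point.
    pooled = {}
    for s, y in zip(scores, outcomes):
        total, count = pooled.get(s, (0.0, 0))
        pooled[s] = (total + y, count + 1)

    blocks = []  # each block: [sum of outcomes, count, highest score covered]
    for s in sorted(pooled):
        total, count = pooled[s]
        blocks.append([total, count, s])
        # Merge backwards while adjacent block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper

    uppers = [b[2] for b in blocks]
    means = [b[0] / b[1] for b in blocks]
    return uppers, means


def calibrate(model, raw_score):
    """Map one raw score to a calibrated probability (clamped at the ends)."""
    uppers, means = model
    idx = min(bisect_left(uppers, raw_score), len(means) - 1)
    return means[idx]
```

Fitting on a held-out slice, as the text describes, means the `scores` and `outcomes` passed to `fit_isotonic` would come from rows the LightGBM learner never trained on.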
The result is a trustworthy probability — the only kind that can be fairly compared against a sportsbook's implied probability. Reliability is monitored on the Performance page via a 10-bucket decile table (model probability vs observed zero-run rate).
Held-out metrics for the active version: Brier 0.2245 · log-loss 0.6436 · AUC 0.5781.
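A 10-bucket reliability table of the kind described above can be sketched as follows. This assumes equal-count buckets over probabilities sorted in ascending order; the function name `reliability_table` and the row layout are hypothetical, not taken from the Performance page.

```python
def reliability_table(probs, outcomes, n_buckets=10):
    """Compare mean model probability with observed zero-run rate per bucket.

    Returns rows of (bucket number, count, mean probability, observed rate).
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    rows = []
    for b in range(n_buckets):
        # Integer arithmetic splits the sorted sample into near-equal chunks.
        lo = b * n // n_buckets
        hi = (b + 1) * n // n_buckets
        chunk = pairs[lo:hi]
        if not chunk:
            continue  # fewer samples than buckets
        mean_prob = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((b + 1, len(chunk), mean_prob, observed))
    return rows
```

For a well-calibrated model the last two columns track each other closely in every bucket.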
5. Edge & filtering
The sportsbook's Under odds are converted to an implied probability with the vig removed (de-vigged). The model's edge is the difference, in percentage points:
edge = model P(zero) − de-vigged implied P(under)
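The de-vig method is not specified here. The sketch below assumes the common proportional method (normalizing the two sides' raw implied probabilities so they sum to 1) and assumes both the Under and Over prices are available. All function names are hypothetical.

```python
def american_to_implied(odds):
    """Raw implied probability (vig still included) from American odds."""
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def devig_under(under_odds, over_odds):
    """Implied P(under) with the vig removed.

    Assumption: proportional de-vig; the model's actual method may differ.
    """
    p_under = american_to_implied(under_odds)
    p_over = american_to_implied(over_odds)
    return p_under / (p_under + p_over)


def edge_points(model_p_zero, under_odds, over_odds):
    """Edge in percentage points: model P(zero) minus de-vigged P(under)."""
    return 100 * (model_p_zero - devig_under(under_odds, over_odds))
```

For example, with a hypothetical Under at -180 and Over at +140, the raw implied probabilities are 9/14 and 5/12, which sum to more than 1; the de-vigged Under probability is 54/89, about 0.607, so a model probability of 0.65 would give an edge of roughly 4.3 points.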
A player-game is surfaced as an UNDER pick only when both gates clear:
- The Under odds are short enough, at or below -160 (the configured odds_threshold), so the market itself favors the Under.
- The edge clears the configured minimum of 3.0 pts (min_edge_threshold).
The model is UNDER-only: it is trained on the zero-runs event, and the asymmetric
0.5 line makes the Under the only directionally-meaningful side. Anything that fails either
gate is recorded as no_pick. Thresholds live in
ru_model_settings (append-only — every change inserts a new row).
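The two-gate decision can be sketched as below. The function name `classify` and the `"UNDER"` return label are hypothetical, and treating "clears" as greater-than-or-equal is an assumption; the default thresholds mirror the configured values quoted above.

```python
def classify(model_p_zero, devigged_p_under, under_odds,
             odds_threshold=-160, min_edge_threshold=3.0):
    """Return "UNDER" only when both gates clear, otherwise "no_pick"."""
    edge = 100 * (model_p_zero - devigged_p_under)  # percentage points
    # American odds: "at or below -160" means -160, -170, -200, and so on.
    odds_ok = under_odds <= odds_threshold
    # Assumption: an edge exactly at the minimum counts as clearing it.
    edge_ok = edge >= min_edge_threshold
    return "UNDER" if odds_ok and edge_ok else "no_pick"
```

There is deliberately no "OVER" branch, matching the UNDER-only design.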
6. Lock & grading
Predictions update through the day as lineups and lines move, then lock shortly before first pitch on a confirmed lineup. A pick is voided if the player's lineup spot is never confirmed, the player is scratched, the player ends up a DNP, or the game is postponed or cancelled. Voids are zero units and are excluded from the bankroll record. Once the game is final, each pick is graded against whether the player actually scored:
| Result | Condition | Units |
|---|---|---|
| win | Player scored zero runs (scored_zero=1). The Under cleared. | +unit × (odds payout factor) |
| loss | Player scored one or more runs (scored_zero=0). The Under missed. | −unit |
| void | Unconfirmed lineup, scratch, DNP, or game postponed/cancelled. | 0.00 |
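The grading table can be sketched as follows, using the standard American-odds payout per unit risked. The function names `payout_factor` and `grade` are hypothetical, and the sketch assumes each pick risks exactly one unit.

```python
def payout_factor(odds):
    """Profit per unit risked at American odds."""
    if odds < 0:
        return 100 / -odds   # e.g. -160 pays 0.625 per unit risked
    return odds / 100        # e.g. +120 pays 1.2 per unit risked


def grade(result, odds, unit=1.0):
    """Units won or lost for a graded pick, following the table above."""
    if result == "win":
        return unit * payout_factor(odds)
    if result == "loss":
        return -unit
    if result == "void":
        return 0.0
    raise ValueError(f"unknown result: {result!r}")
```

Because every pick is priced at -160 or shorter, a win returns at most 0.625 units while a loss costs a full unit.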
7. Sufficiency gate
Recommendations stay hidden on the public slate until the model has proven itself on a live sample. Both conditions must hold before picks publish:
- At least 150 graded picks (sufficiency_min_graded).
- A live Brier score at or below 0.18 (sufficiency_brier_ceiling).
published = (graded_count ≥ min_graded) AND (brier_live ≤ brier_ceiling)
Until the gate clears, the Today and Performance pages show probabilities and lines but no UNDER call-to-action, alongside a yellow “Building sample” banner that reports progress.
brier_live is
mean((model_prob − scored_zero)²) over graded UNDER picks, and is
null until at least one pick is graded.
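The gate and the live Brier score can be sketched together. The function names and the `(model_prob, scored_zero)` tuple layout are hypothetical; the defaults mirror the configured values quoted above, and `None` stands in for the null state.

```python
def brier_live(graded):
    """Mean squared error of model probability against the 0/1 outcome.

    graded: list of (model_prob, scored_zero) for graded UNDER picks.
    Returns None until at least one pick has been graded.
    """
    if not graded:
        return None
    return sum((p - y) ** 2 for p, y in graded) / len(graded)


def is_published(graded, min_graded=150, brier_ceiling=0.18):
    """published = (graded_count >= min_graded) AND (brier_live <= brier_ceiling)."""
    score = brier_live(graded)
    if score is None:
        return False
    return len(graded) >= min_graded and score <= brier_ceiling
```

Voided picks never enter `graded`, so they count toward neither the sample size nor the score.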
8. Limitations
- Single book — lines and odds are read from one sportsbook; the model does not shop the line across books for the best available Under price.
- No platoon-specific behind-hitters split — the lineup-context group captures who bats behind the hitter, but not the platoon-specific (vs-LHP / vs-RHP) production of those run producers against the listed starter.
- Weather is best-effort / often unavailable — temperature, wind, and humidity are not reliably in the feature set; the environment group leans on venue and park factors.
9. Live settings
All values below are read live from ru_model_settings (append-only —
the most recent row per key wins).
| Key | Value | Last updated |
|---|---|---|
| default_book | Hard Rock Bet | 2026-06-27 02:40:49 |
| feature_weight_overrides | {"on_base":1,"lineup_context":1,"pitcher_suppression":1,"environment":1,"team_offense":1} | 2026-06-27 02:40:49 |
| min_edge_threshold | 3 | 2026-06-27 02:40:49 |
| odds_threshold | -160 | 2026-06-27 02:40:49 |
| poll_lead_minutes | 120 | 2026-06-27 02:40:49 |
| sufficiency_brier_ceiling | 0.18 | 2026-06-27 02:40:49 |
| sufficiency_min_graded | 150 | 2026-06-27 02:40:49 |
| unit_size | 25 | 2026-06-27 02:40:49 |
All numeric values above are live reads from the model database.
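The "most recent row per key wins" rule for an append-only settings table can be sketched as below. The `(key, value, updated_at)` row layout and the function name `current_settings` are assumptions; the actual schema of ru_model_settings is not shown here.

```python
def current_settings(rows):
    """Resolve an append-only settings log to its current values.

    rows: iterable of (key, value, updated_at) tuples, where updated_at is a
    "YYYY-MM-DD HH:MM:SS" string (such strings sort chronologically).
    """
    latest = {}
    for key, value, updated_at in rows:
        # Keep the newest row per key; on equal timestamps the later row wins.
        if key not in latest or updated_at >= latest[key][1]:
            latest[key] = (value, updated_at)
    return {key: value for key, (value, _) in latest.items()}
```

Because changes are inserted rather than updated in place, earlier rows remain available as a history of every threshold the model has run with.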