Runs Under Model — Methodology

How the GameDay Analytics runs-under model predicts the probability that a batter scores zero runs in a game — the Under 0.5 runs prop. Live values (active version, metrics, settings) are read directly from the model database; the surrounding narrative describes the fixed pipeline and decision logic.

1. What the model predicts

For each projected starting hitter the model estimates a single number: P(zero runs) — the probability the player scores zero runs in the game. That is exactly the Under 0.5 runs side of the batter-runs prop.

This is a binary classification problem. The training target is scored_zero ∈ {0, 1}: 1 means the player did not score (the Under wins), and 0 means the player scored one or more runs (the Under loses). The model's output is a single calibrated probability in [0, 1], which every downstream edge calculation compares against the book.
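As a minimal sketch of the target definition, the label can be derived from a box-score run count. The function name and the `runs_scored` argument are illustrative assumptions, not names taken from the pipeline.

```python
def scored_zero(runs_scored: int) -> int:
    """Return 1 if the batter scored no runs (the Under 0.5 wins), else 0."""
    if runs_scored < 0:
        raise ValueError("runs_scored cannot be negative")
    return 1 if runs_scored == 0 else 0


# Four hypothetical player-games with 0, 1, 3 and 0 runs scored.
labels = [scored_zero(r) for r in (0, 1, 3, 0)]
print(labels)  # [1, 0, 0, 1]
```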

2. Active model

Field	Value
Version tag	`v1.0-lgbm` (active)
Trained at	2026-06-27 21:22:18
Algorithm	LightGBM + isotonic
Target	`scored_zero` (0/1)
Brier	0.2245
Log loss	0.6436
AUC	0.5781
Base zero rate	64.8%
Test rows	3,733

Recent model versions

Version	Trained at	Active	Brier	Log loss	AUC
`v1.0-lgbm`	2026-06-27 21:22:18	active	0.2245	0.6436	0.5781

3. Feature groups

The probability blends five weighted groups of inputs, all computed from the current-season (2026) context at prediction time. Each group can be scaled independently via model settings.

on_base 6 features

The batter's own propensity to reach base and score: season and recent-window on-base rate, recent reach rate, and season strikeout and walk rates. This group is the dominant driver of whether they score at all.

Member features
`player_season_obp`
`player_l7_obp`
`player_l15_obp`
`player_l15_reach_rate`
`player_season_k_pct`
`player_season_bb_pct`

lineup_context 3 features

Where the batter hits and the run-scoring context around them: batting slot, plus the on-base and slugging rates of the hitters behind them, which stand in for their ability to drive the batter in.

Member features
`batting_slot`
`behind_obp`
`behind_slg`

pitcher_suppression 6 features

The opposing starter's ability to suppress baserunners and runs: on-base rate allowed, strikeout rate, runners-on rate, and Statcast suppression signals (xwOBA, hard-hit rate, and barrel rate against).

Member features
`sp_obp_against`
`sp_k_pct`
`sp_runners_on_rate`
`sp_xwoba_against`
`sp_hardhit_pct_against`
`sp_barrel_pct_against`

environment 4 features

Park and game-environment factors: the venue's run environment, home/away, day/night, and temperature when it is available.

Member features
`park_run_factor`
`is_home`
`is_night`
`temperature`

team_offense 2 features

The batter's team offensive strength and the run expectation for this game: team runs per game and the implied team total.

Member features
`team_runs_per_game`
`implied_team_total`
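The five groups and their members can be written down as a plain mapping, which makes it easy to check the feature counts and to validate a `feature_weight_overrides` value against the group names. This is only a sketch: `FEATURE_GROUPS` and `group_weights` are hypothetical names, the default weight of 1.0 is an assumption, and how the weights are actually applied inside the model is not described here.

```python
import json

# Group membership copied from the lists above.
FEATURE_GROUPS = {
    "on_base": [
        "player_season_obp", "player_l7_obp", "player_l15_obp",
        "player_l15_reach_rate", "player_season_k_pct", "player_season_bb_pct",
    ],
    "lineup_context": ["batting_slot", "behind_obp", "behind_slg"],
    "pitcher_suppression": [
        "sp_obp_against", "sp_k_pct", "sp_runners_on_rate",
        "sp_xwoba_against", "sp_hardhit_pct_against", "sp_barrel_pct_against",
    ],
    "environment": ["park_run_factor", "is_home", "is_night", "temperature"],
    "team_offense": ["team_runs_per_game", "implied_team_total"],
}

# The feature_weight_overrides value as listed in the live settings table.
overrides = json.loads(
    '{"on_base":1,"lineup_context":1,"pitcher_suppression":1,'
    '"environment":1,"team_offense":1}'
)


def group_weights(groups, overrides):
    """Return one weight per group; assumes 1.0 when a group has no override."""
    unknown = set(overrides) - set(groups)
    if unknown:
        raise KeyError(f"unknown feature groups: {sorted(unknown)}")
    return {name: float(overrides.get(name, 1.0)) for name in groups}


weights = group_weights(FEATURE_GROUPS, overrides)
total_features = sum(len(members) for members in FEATURE_GROUPS.values())
print(total_features)  # 21
```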

4. Calibration

The core learner is a LightGBM gradient-boosted classifier. Its raw scores order player-games from more to less likely to score zero, but they are not reliable probabilities, so they are passed through an isotonic regression fit on a held-out slice of the data. Isotonic calibration is a monotonic, non-parametric mapping: it never reverses the ranking, while bending the curve so that a published 70% really does cash about 70% of the time.
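To illustrate the idea, here is a small pure-Python sketch of isotonic regression using the pool-adjacent-violators algorithm. It is not the production calibrator: the function names are hypothetical, the fitted map is a step function (library implementations typically interpolate between points), and the six score/label pairs are made up for the demonstration.

```python
from bisect import bisect_right


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from raw scores to observed rates."""
    # Pool tied scores so each distinct score becomes one weighted point.
    pooled = {}
    for score, label in zip(scores, labels):
        total, count = pooled.get(score, (0.0, 0))
        pooled[score] = (total + label, count + 1)

    blocks = []  # each block: [label_sum, count, lowest_score_in_block]
    for score in sorted(pooled):
        total, count = pooled[score]
        blocks.append([total, count, score])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, _ = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    starts = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return starts, values


def calibrate(score, starts, values):
    """Map a raw score to the value of the block it falls in."""
    idx = bisect_right(starts, score) - 1
    return values[max(idx, 0)]  # scores below the first block use its value


# Made-up held-out slice: raw scores and whether the batter scored zero.
raw = [0.2, 0.4, 0.5, 0.6, 0.7, 0.9]
zero = [0, 1, 0, 1, 1, 1]
starts, values = fit_isotonic(raw, zero)
print(calibrate(0.45, starts, values))  # 0.5
```

The out-of-order pair at 0.4 and 0.5 is pooled into a single block with rate 0.5, which is how the mapping stays monotonic.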

The result is a calibrated probability, the only kind that can be fairly compared against a sportsbook's implied probability. Reliability is monitored on the Performance page via a decile table (10 buckets of model probability vs. observed zero-run rate).

Held-out metrics for the active version: Brier 0.2245 · log-loss 0.6436 · AUC 0.5781.
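For readers who want to reproduce these metrics, the sketch below shows the standard Brier and log-loss formulas, plus the reference values a constant forecast at the base zero rate would produce. It assumes the 64.8% base rate describes the same held-out rows as the reported metrics, which is not stated explicitly; the function names are illustrative.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between forecast and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)


# Reference point: always forecasting the base zero rate r.
# Expected Brier is r * (1 - r); expected log loss is the entropy of r.
BASE_ZERO_RATE = 0.648  # assumption: same rows as the held-out metrics
baseline_brier = BASE_ZERO_RATE * (1 - BASE_ZERO_RATE)
baseline_log_loss = -(
    BASE_ZERO_RATE * math.log(BASE_ZERO_RATE)
    + (1 - BASE_ZERO_RATE) * math.log(1 - BASE_ZERO_RATE)
)
print(round(baseline_brier, 4), round(baseline_log_loss, 4))
```

Under that assumption the constant forecast scores roughly 0.228 (Brier) and 0.649 (log loss), which gives a yardstick for reading the held-out numbers.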

5. Edge & filtering

The sportsbook's Under odds are converted to an implied probability with the vig removed (de-vigged). The model's edge is the difference, in percentage points:

edge (pts) = 100 × (model P(zero) − de-vigged implied P(under))
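The calculation can be sketched as follows. The de-vig method is not specified here, so this example assumes the common two-way proportional (multiplicative) normalization and assumes both the Under and Over prices are available; the function names and the example odds are hypothetical.

```python
def american_to_implied(odds: int) -> float:
    """Convert American odds to the raw (vig-included) implied probability."""
    if odds == 0:
        raise ValueError("American odds cannot be 0")
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def devig_under(under_odds: int, over_odds: int) -> float:
    """Assumed method: rescale both sides so the two probabilities sum to 1."""
    p_under = american_to_implied(under_odds)
    p_over = american_to_implied(over_odds)
    return p_under / (p_under + p_over)


def edge_points(model_p_zero: float, under_odds: int, over_odds: int) -> float:
    """Edge in percentage points: model probability minus de-vigged market."""
    return 100 * (model_p_zero - devig_under(under_odds, over_odds))


# Hypothetical market: Under -180 / Over +140, model P(zero) = 0.66.
fair_under = devig_under(-180, 140)
edge = edge_points(0.66, -180, 140)
print(round(fair_under, 4), round(edge, 2))
```

In this example the raw implied probabilities are 9/14 and 5/12, so the de-vigged Under probability is 54/89 (about 0.607) and the edge is about 5.3 points.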

A player-game is surfaced as an UNDER pick only when both gates clear:

The Under odds are short enough: at or below the configured odds_threshold of -160 in American odds (for example, -175 qualifies and -140 does not), so the market itself favors the Under.
The edge clears the configured minimum of 3.0 pts (min_edge_threshold).

The model is UNDER-only: it is trained on the zero-runs event, and the asymmetric 0.5 line makes the Under the only directionally-meaningful side. Anything that fails either gate is recorded as no_pick. Thresholds live in ru_model_settings (append-only — every change inserts a new row).
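The two gates reduce to a short predicate. The sketch below uses the threshold values quoted above as defaults and assumes that an edge exactly equal to the minimum counts as clearing it; `decide_pick` is a hypothetical name.

```python
def decide_pick(under_odds: int, edge_pts: float,
                odds_threshold: int = -160,
                min_edge_threshold: float = 3.0) -> str:
    """Return "UNDER" only when both the odds gate and the edge gate clear."""
    odds_ok = under_odds <= odds_threshold    # -160 or shorter
    edge_ok = edge_pts >= min_edge_threshold  # assumption: inclusive bound
    return "UNDER" if odds_ok and edge_ok else "no_pick"


print(decide_pick(-175, 4.2))  # UNDER
print(decide_pick(-140, 6.0))  # no_pick (odds gate fails)
print(decide_pick(-175, 2.9))  # no_pick (edge gate fails)
```

Plus-money Unders such as +110 fail the odds gate automatically, since any positive number is greater than -160.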

6. Lock & grading

Predictions update through the day as lineups and lines move, then lock shortly before first pitch on a confirmed lineup. A pick is voided if the player's lineup spot is never confirmed, the player is scratched, the player ends up a DNP, or the game is postponed or cancelled; voids are zero units and are excluded from the bankroll record. Once the game is final, each pick is graded against whether the player actually scored:

Result	Condition	Units
`win`	Player scored zero runs (`scored_zero=1`). The Under cleared.	+unit × (odds payout factor)
`loss`	Player scored one or more runs (`scored_zero=0`). The Under missed.	−unit
`void`	Unconfirmed lineup, scratch, DNP, or game postponed/cancelled.	0.00
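The grading table can be sketched as a small function. It assumes one unit is risked per pick and that the payout factor is the standard American-odds conversion (100 / |odds| for negative prices, odds / 100 for positive ones); `grade_pick`, its arguments, and the default unit of 25 (the `unit_size` setting) are illustrative.

```python
from typing import Optional, Tuple


def payout_factor(american_odds: int) -> float:
    """Profit per unit risked at the given American odds."""
    if american_odds == 0:
        raise ValueError("American odds cannot be 0")
    if american_odds < 0:
        return 100 / -american_odds
    return american_odds / 100


def grade_pick(runs_scored: Optional[int], under_odds: int,
               unit: float = 25.0, voided: bool = False) -> Tuple[str, float]:
    """Return (result, units won or lost) for one locked UNDER pick."""
    # Unconfirmed lineup, scratch, DNP, or postponed/cancelled game.
    if voided or runs_scored is None:
        return ("void", 0.0)
    if runs_scored == 0:
        return ("win", unit * payout_factor(under_odds))
    return ("loss", -unit)


print(grade_pick(0, -160))                  # ('win', 15.625)
print(grade_pick(2, -160))                  # ('loss', -25.0)
print(grade_pick(None, -160, voided=True))  # ('void', 0.0)
```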

7. Sufficiency gate

Recommendations stay hidden on the public slate until the model has proven itself on a live sample. Both conditions must hold before picks publish:

At least 150 graded picks (sufficiency_min_graded).
A live Brier score at or below 0.18 (sufficiency_brier_ceiling).

published = (graded_count ≥ min_graded) AND (brier_live ≤ brier_ceiling)

Until the gate clears, the Today and Performance pages show probabilities and lines but no UNDER call-to-action, alongside a yellow “Building sample” banner that reports progress.

Note: brier_live is mean((model_prob − scored_zero)²) over graded UNDER picks, and is null until at least one pick is graded.
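Putting the gate and the brier_live definition together, a minimal sketch looks like this. The input shape (a list of `(model_prob, scored_zero)` pairs for graded UNDER picks) and the function names are assumptions; the defaults mirror the two settings quoted above.

```python
def brier_live(graded):
    """Mean squared error over graded picks; None until one pick is graded."""
    if not graded:
        return None
    return sum((p - y) ** 2 for p, y in graded) / len(graded)


def is_published(graded, min_graded=150, brier_ceiling=0.18):
    """Both conditions must hold before picks publish."""
    score = brier_live(graded)
    if score is None:
        return False
    return len(graded) >= min_graded and score <= brier_ceiling


# Hypothetical sample: 150 picks at model_prob 0.70, of which 120 cashed.
sample = [(0.70, 1)] * 120 + [(0.70, 0)] * 30
print(round(brier_live(sample), 4))  # 0.17
print(is_published(sample))          # True
print(is_published(sample[:149]))    # False (too few graded picks)
```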

8. Limitations

Single book — lines and odds are read from one sportsbook; the model does not shop the line across books for the best available Under price.
No platoon-specific behind-hitters split — the lineup-context group captures who bats behind the hitter, but not the platoon-specific (vs-LHP / vs-RHP) production of those run producers against the listed starter.
Weather is best-effort / often unavailable — temperature, wind, and humidity are not reliably in the feature set; the environment group leans on venue and park factors.

9. Live settings

All values below are read live from ru_model_settings (append-only — the most recent row per key wins).

Key	Value	Last updated
`default_book`	`Hard Rock Bet`	2026-06-27 02:40:49
`feature_weight_overrides`	`{"on_base":1,"lineup_context":1,"pitcher_suppression":1,"environment":1,"team_offense":1}`	2026-06-27 02:40:49
`min_edge_threshold`	`3`	2026-06-27 02:40:49
`odds_threshold`	`-160`	2026-06-27 02:40:49
`poll_lead_minutes`	`120`	2026-06-27 02:40:49
`sufficiency_brier_ceiling`	`0.18`	2026-06-27 02:40:49
`sufficiency_min_graded`	`150`	2026-06-27 02:40:49
`unit_size`	`25`	2026-06-27 02:40:49

All numeric values above are live reads from the model database.
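The append-only rule (most recent row per key wins) can be sketched as below. The row shape `(key, value, updated_at)` is an assumed simplification of ru_model_settings, and the earlier `min_edge_threshold` row is hypothetical, added only to show an override being superseded.

```python
def resolve_settings(rows):
    """Collapse append-only rows to the latest value per key."""
    latest = {}
    for key, value, updated_at in rows:
        # "YYYY-MM-DD HH:MM:SS" strings sort chronologically as plain text.
        if key not in latest or updated_at > latest[key][1]:
            latest[key] = (value, updated_at)
    return {key: value for key, (value, _) in latest.items()}


rows = [
    ("min_edge_threshold", "2.5", "2026-06-20 09:00:00"),  # hypothetical older row
    ("min_edge_threshold", "3", "2026-06-27 02:40:49"),
    ("odds_threshold", "-160", "2026-06-27 02:40:49"),
    ("unit_size", "25", "2026-06-27 02:40:49"),
]
settings = resolve_settings(rows)
print(settings["min_edge_threshold"])  # 3
```

If two rows for the same key share a timestamp, this sketch keeps the first one it sees; a real table would more likely break ties on an auto-incrementing id.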