IBP
INDEPENDENT BASEBALL PROJECTIONS
Model Architecture · Technical Reference · Dual-Poisson Win Probability Engine
33-Signal Model
33 Log-Odds Adj.
Dual-Poisson
Platt Calibrated
CLV Tracked
7,359 Games OOS
How Independent Baseball Projections Works — At a Glance
Model Type: Dual-Poisson · λ_home + λ_away → CDF
Feature Layers: 4 · 33 signals · pitcher / sit. / market
Adjustments: 33 log-odds · logit(p) += Σ αᵢ
Calibration: Platt sigmoid · A = 0.590 · B = +0.025
Bet Filter: edge ≥ 5% · vs. no-vig Pinnacle
Sizing Method: ½-Kelly / 3u · bankroll fraction
OOS Period: 2022 – 2024 · 3 seasons, sequential
Brier Score: 0.234 · vs. 0.250 market · ↓ better
Independent Baseball Projections is a market-informed model

Raw team strength, pitching, lineup, park, weather, and situational signals generate the baseline projection. Market information — specifically the book O/U total and Pinnacle no-vig probability — is used as a stabilizing input (via the book_total_constraint factor, α₁₆) to reduce extreme run-total outputs and align the model with the sharp betting market's run environment. After Platt calibration, the final Independent Baseball Projections probability is compared against the no-vig Pinnacle market probability to identify remaining pricing gaps where the model and market diverge.

Because the market is used as both a stabilizing input and a comparison benchmark, Independent Baseball Projections should be understood as a market-informed fair-line model rather than a fully market-independent projection. This is disclosed transparently; the model still identifies genuine pricing gaps in approximately 15–25% of games per day.

Inference Pipeline — Input → Transform → Output at Each Stage
01 · Data Ingestion
IN: Cron trigger · game schedule · ump assignments · weather coords
OUT: Raw JSON: odds, lineups, weather, ump_id, FIP data per game
02 · Feature Eng.
IN: Raw JSON dicts per game
OUT: 30 float features (home/away): xFIP, xERA, K-BB%, OPS splits, BvP, arsenal fit, park, weather, ump…
03 · λ Construction
IN: RS₃ᵧᵣ, RS_L15, park_factor, book O/U
OUT: (λ_home, λ_away) floats, expected runs per team
04 · Poisson Win Prob
IN: (λ_home, λ_away)
OUT: p_base ∈ (0,1), exact CDF summation
05 · Log-Odds Adj.
IN: p_base + 33 αᵢ float signals
OUT: p_adj = sigmoid(logit(p_base) + Σαᵢ)
06 · Platt Calib.
IN: p_adj (raw model output)
OUT: p_cal = σ(0.590·logit(p_adj) + 0.025)
07 · Edge Detection
IN: p_cal + p_nv_Pinnacle
OUT: edge float; bet_flag if ≥ 0.04
08 · Kelly Sizing
IN: edge, decimal_odds, bankroll
OUT: stake (units); cap 3u; log → picks_log.csv
Core Mathematics — Formulas Behind Each Pipeline Stage
λ Run Expectancy Construction
Base lambda — form-blended, park-adjusted
λ_base = (0.65 × RS₃ᵧᵣ + 0.35 × RS_L15) / G × park_factor // RS₃ᵧᵣ: 3yr rolling RS/G · RS_L15: last 15G avg
Log-odds stacking — 33 factors, logit space
logit(p) = log(p / (1 − p))
logit(p_adj) = logit(p_base) + Σᵢ αᵢ // 33 αᵢ ∈ ℝ · logit prevents p ∉ (0,1)
p_adj = sigmoid(logit(p_adj))
65/35 blend weights recent L15 performance to capture hot/cold streaks without overreacting to small samples. Park adjustment applied symmetrically to both offensive and defensive λ.
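The log-odds stacking formula above can be sketched in a few lines. This is a minimal illustration, not the production code: the function names and the three example adjustments are hypothetical, and only the logit → sum → sigmoid structure comes from the formula itself.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def stack_log_odds(p_base: float, alphas: list[float]) -> float:
    """Add every adjustment in logit space, then map back to a probability."""
    return sigmoid(logit(p_base) + sum(alphas))


# Hypothetical adjustments, for illustration only (not real model weights).
example_alphas = [0.05, -0.02, 0.01]
p_adj = stack_log_odds(0.531, example_alphas)
```

Because the sum happens in logit space, even a very large adjustment cannot push the result outside (0, 1), which is the property the formula comment refers to.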
📊 Poisson Distribution & Win Probability
Run distributions — modeled independently
R_home ~ Poisson(λ_home)
R_away ~ Poisson(λ_away)
P(X=k ; λ) = exp(−λ) · λ^k / k!
Win probability — exact CDF (no simulation)
P(home wins) = Σ_{k=1}^∞ P(R_home=k) · CDFPois(λ_away, k−1) // Tie → extra innings as 50/50
No Monte Carlo needed — win prob is computed exactly from Poisson CDFs. Teams are modeled as independent (no run correlation). Pitcher, park, and lineup signals enter via λ; 33 log-odds factors adjust the derived win probability directly.
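As a rough sketch of the exact summation described above, the following uses only the standard library. It assumes independent Poisson run totals and splits tie mass 50/50, as the formula comment states; the function names and the truncation point max_runs are illustrative assumptions, not taken from the model's code.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def home_win_prob(lam_home: float, lam_away: float, max_runs: int = 40) -> float:
    """P(home scores more) + 0.5 * P(tie), summed up to max_runs per team.

    max_runs is an assumed truncation point; Poisson mass beyond 40 runs
    is negligible for realistic baseball run expectancies.
    """
    p_home_more = 0.0
    p_tie = 0.0
    cdf_away = 0.0  # running P(R_away <= k - 1)
    for k in range(max_runs + 1):
        p_home_k = poisson_pmf(k, lam_home)
        p_away_k = poisson_pmf(k, lam_away)
        p_home_more += p_home_k * cdf_away  # home scores k, away scores fewer
        p_tie += p_home_k * p_away_k        # both score exactly k
        cdf_away += p_away_k
    return p_home_more + 0.5 * p_tie
```

Calling home_win_prob with equal λ values returns 0.5, which is a quick sanity check that the tie handling is symmetric.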
σ Platt Scaling Calibration
Sigmoid recalibration — fitted OOS 2022–2024
p_cal = σ(A · logit(p_raw) + B) = 1 / (1 + exp(−(0.590 · logit(p_raw) + 0.025)))
Parameter interpretation
A = 0.590 // < 1: model over-expresses confidence; compress toward 50%
B = +0.025 // home-team bias correction
Fitted via MLE on 3yr OOS outcomes. Brier Score: 0.234 model vs. 0.250 market — 6.4% improvement. A < 1 confirms the raw Poisson systematically over-prices favorites; calibration compresses the logit toward the empirical base rate.
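A minimal sketch of applying the Platt constants quoted above, plus a plain Brier score helper. The function names are hypothetical; the constants A = 0.590 and B = +0.025 are the ones stated in this section, and nothing here reproduces the MLE fitting step.

```python
import math

A = 0.590  # slope quoted in this section
B = 0.025  # intercept quoted in this section


def platt(p_raw: float, a: float = A, b: float = B) -> float:
    """Recalibrate a raw probability: sigmoid(a * logit(p_raw) + b)."""
    z = a * math.log(p_raw / (1.0 - p_raw)) + b
    return 1.0 / (1.0 + math.exp(-z))


def brier(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Relative improvement implied by the two Brier scores quoted above.
improvement = (0.250 - 0.234) / 0.250  # 0.064, i.e. the stated 6.4%
```

With a slope below 1, a raw 70% is pulled back toward 50%, while a raw 50% moves slightly above 50% because of the positive intercept.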
⚖️ No-Vig Market Probability Strip
American odds → raw implied probability
p_raw(ml) = |ml| / (|ml| + 100) if ml < 0 (favorite)
p_raw(ml) = 100 / (ml + 100) if ml > 0 (underdog)
Simultaneous both-side margin removal
hold = p_raw(h_ml) + p_raw(a_ml) − 1
p_nv_home = p_raw(h_ml) / (p_raw(h_ml) + p_raw(a_ml)) // Pinnacle benchmark: hold ≈ 2.5%
Simultaneous stripping avoids asymmetric vig attribution — equivalent to the multiplicative method. Pinnacle's ~2.5% hold makes its no-vig the tightest available reflection of sharp market consensus.
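The two formulas above translate directly into code. The sketch below is illustrative only; the function names are hypothetical and the moneylines in the usage line are made-up inputs, not real Pinnacle prices.

```python
def implied_prob(ml: int) -> float:
    """Raw implied probability from an American moneyline (vig included)."""
    if ml < 0:
        return -ml / (-ml + 100.0)  # favorite: |ml| / (|ml| + 100)
    return 100.0 / (ml + 100.0)     # underdog: 100 / (ml + 100)


def strip_vig(home_ml: int, away_ml: int) -> tuple[float, float, float]:
    """Return (p_nv_home, p_nv_away, hold) by normalizing both sides at once."""
    p_home = implied_prob(home_ml)
    p_away = implied_prob(away_ml)
    total = p_home + p_away
    return p_home / total, p_away / total, total - 1.0


# Made-up example prices: home -130, away +120.
p_nv_home, p_nv_away, hold = strip_vig(-130, 120)
```

Dividing each side by the same total is what makes the strip "simultaneous": both sides absorb the margin in proportion to their raw implied probability.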
⚡ Edge Detection & Stake Sizing
Model edge vs. no-vig Pinnacle
edge = p_cal − p_nv_Pinnacle // Bet flagged if edge ≥ 0.05
Fractional Kelly sizing — conservative multi-step reduction
b = decimal_odds − 1
p = p_cal ; q = 1 − p
f* = 0.5 × (b · p − q) / b // units = f* × bankroll ; cap: 3u
Stake sizes use a conservative fractional Kelly approach: base fraction × verdict multiplier × display fraction. CONDITIONAL picks (4–6pp) are full-sized; REDUCED CONF. picks are sized at half the normal rate; FLAGGED picks at ¾. The 3u cap prevents over-sizing on high-edge outliers.
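A sketch of the edge and half-Kelly arithmetic above, under stated assumptions: the bankroll is expressed in units, negative-expectation bets are sized at zero, and the verdict and display multipliers mentioned in the paragraph are left out because their exact values are not all given here. Function names are hypothetical.

```python
def edge(p_cal: float, p_nv_pinnacle: float) -> float:
    """Model probability minus the no-vig market probability."""
    return p_cal - p_nv_pinnacle


def half_kelly_units(
    p_cal: float,
    decimal_odds: float,
    bankroll_units: float,
    kelly_fraction: float = 0.5,
    cap_units: float = 3.0,
) -> float:
    """Fractional Kelly stake in units, floored at 0 and capped at cap_units."""
    b = decimal_odds - 1.0  # net payout per unit staked
    q = 1.0 - p_cal
    f_star = kelly_fraction * (b * p_cal - q) / b
    if f_star <= 0.0:
        return 0.0          # assumed behavior: never stake a negative edge
    return min(f_star * bankroll_units, cap_units)
```

For example, with p_cal = 0.545 at decimal odds 2.48 (American +148), the half-Kelly fraction is roughly 11.9% of bankroll, so on a 100-unit bankroll the 3u cap is what actually binds.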
📈 Closing Line Value — Process Validation
Definition — Pinnacle no-vig both sides
CLV = p_close_nv − p_open_nv // Snapshot: pre-game closing line (CT)
Interpretation
CLV > 0 → market moved toward pick
CLV < 0 → market moved against pick
// +1% avg CLV ≈ long-run positive EV
// Separates skill from short-run variance
CLV is the gold standard for betting process validation — a model can run cold for 50 games but still show positive CLV, confirming the signal is real. Tracked per-bet in picks_log.csv, surfaced via --poster-stats.
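The CLV definition above reduces to a subtraction per bet and an average across bets. The sketch below assumes both probabilities are no-vig values for the picked side; the function names and sample numbers are hypothetical and do not reflect the actual picks_log.csv schema.

```python
def clv(p_open_nv: float, p_close_nv: float) -> float:
    """Closing no-vig probability minus opening no-vig probability (picked side)."""
    return p_close_nv - p_open_nv


def average_clv(bets: list[tuple[float, float]]) -> float:
    """Mean CLV over (p_open_nv, p_close_nv) pairs."""
    return sum(clv(p_open, p_close) for p_open, p_close in bets) / len(bets)


# Made-up sample: one pick the market moved toward, one it moved against.
sample = [(0.50, 0.52), (0.55, 0.54)]
avg = average_clv(sample)
```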
Signal Importance — Avg. Abs. Log-Odds Contribution per Factor (αᵢ) · Relative Magnitudes Approximate · v2 = 2026 update
Importances reflect the model's design; magnitudes are approximate. Several factors marked inactive below do not currently affect live picks — retired after backtest validation (bullpen fatigue, pitcher-form slope) or awaiting data wiring (catcher framing, bullpen availability, career H2H matchup). In practice, live discrimination is concentrated in the pitcher-xFIP and run-environment core; the situational adjustments are deliberately small.
⚾ Pitcher Layer
5 factors · α₁–α₅
α₁ · Starter xFIP matchup · 0.18
α₅ · Bullpen xFIP tiered · 0.10
α₂ · Platoon FIP split · 0.09
α₃ · SwStr% plate discipline · 0.05
α₄ · Pitcher form slope L5 · inactive · off
🌤 Situational Layer
6 factors · α₆–α₁₁
α₆ · Park factor · 0.12
α₁₀ · Defensive OAA → RAA · 0.07
α₉ · Umpire run tendency · 0.06
α₇ · Weather: wind carry · 0.05
α₁₁ · Catcher framing runs · inactive · off
α₈ · Weather: temperature · 0.03
📈 Market & Lineup Layer
7 factors · α₁₂–α₁₈
α₁₂ · Lineup OPS split (hand) · 0.11
α₁₆ · O/U market constraint · 0.08
α₁₃ · Lineup quality delta · 0.07
α₁₅ · Home field advantage · 0.06
α₁₇ · Rolling RS/RA form · 0.05
α₁₄ · Bullpen availability · inactive · off
α₁₈ · Rest days differential · 0.03
⚡ Advanced Signals
15 factors · α₁₉–α₃₃
α₁₉ · Baserunning quality (BsR), lambda multiplier · 0.03
α₂₀ · Travel fatigue / back-to-back scheduling · 0.02
α₂₁ · Career H2H matchup ERA vs opponent · inactive · off
α₂₂ · Statcast xERA blend · 0.09
α₂₇ · Bullpen 14-day rolling xFIP · 0.07
α₂₄ · K-BB% command signal · 0.06
α₂₈ · BvP career matchup · 0.05
α₂₃ · Times-through-order penalty · 0.04
α₂₉ · Arsenal fit score · 0.03
α₂₅/₂₆ · Opener / lineup certainty shrinkage · 0.02
α₃₀ · Stuff+ composite (FanGraphs sp_pitching) · 0.02
α₃₂ · Recent pitcher form: ERA last 3 starts · 0.02
α₃₁ · Precipitation probability shrinkage · 0.01
α₃₃ · Line movement: intraday Pinnacle sharp-money signal · 0.02
Feature Engineering — 33-Signal Architecture Across 4 Layers (v2/v3: 10 new signals added 2026 · a few currently inactive, see note above)
Pitcher Layer
MLB Stats API · Baseball Savant
starter_xfip
Regressed ERA estimator: takes FIP and replaces the pitcher's actual home runs with fly balls allowed × the league HR/FB rate, which removes HR/FB variance on top of the BABIP luck that FIP already strips out. Primary pitcher quality signal, computed in-house from Statcast batted-ball data and MLB Stats API components so that it does not depend on FanGraphs being reachable (FanGraphs values are used when available). Blended 70% individual / 30% league average to reduce overconfidence at extremes.
pitcher_quality_xfip v4
Display-only approximation of the starter's xFIP contribution to the Poisson λ ratio. Shows the pick team's starter quality advantage vs opponent — e.g. +1.9pp when facing a 4.77 xFIP opponent with a 2.31 xFIP starter. Not a separate log-odds adjustment (would double-count); captures the Poisson core contribution explicitly.
bullpen_quality_xfip v4
Same approximation applied to the bullpen innings share (~42% at default projected IP). Surfaces bullpen quality differential as a visible signal when one team's 'pen is meaningfully better than the other's.
projected_starter_ip v4
Phase 1 IP projection: weighted blend of season IP/start (60%), last-5 starts (25%), and last-3 starts (15%) with Bayesian smoothing toward 5.3 IP league average. Rest adjustment (−0.4 short rest to +0.1 extra rest) and pitch-count fatigue applied. Controls the starter/bullpen xFIP split weight — a starter projected for 6.5 IP weights the starter signal at 72% vs 28% bullpen.
xera_blend v2
Statcast xERA (contact quality from exit velocity/launch angle) blended with xFIP — up to 30% weight at 300+ PA faced. Captures contact suppression independent of K/BB.
k_minus_bb_pct v2
K%−BB% (MLB Stats API) — direct command + dominance signal. Complements SwStr% by capturing walk suppression that swinging-strike rate misses.
tto_penalty v2
Times-through-order degradation: ~0.25 runs/TTO beyond the first. Batters adapt; a starter projected for 7+ IP faces measurably higher opponent scoring late.
platoon_fip_split
LHH/RHH xFIP delta weighted by opposing lineup handedness % (per-batter split, not team-level).
swstr_pct
Swinging-strike rate from Baseball Savant. Leading indicator for K% — predicts FIP before results converge.
bullpen_xfip_tiered
Closer (35%) / Setup (25%) / Middle (40%) xFIP blend. Fatigue-adjusted by recent appearance counts per tier.
bullpen_recent_xfip v2
14-day rolling bullpen xFIP blended 30/70 with season average. Captures in-season bullpen volatility that season-long averages smooth over.
arsenal_fit v2
Statcast pitch-mix matchup score (−1 to +1): how well the starter's pitch category usage (power FB, breaking, offspeed) suppresses the opposing lineup's handedness profile.
pitcher_form_slope inactive
OLS slope of xFIP over last 5 starts. Currently inactive — retired after the 5-start slope showed no reliable directional edge (mean-reversion; market already prices recent form).
stuff_plus v3
FanGraphs sp_pitching composite (100 = avg). Measures raw pitch quality — velocity, movement, release point — independently of outcomes. A small, best-effort signal (weight ~0.02) sourced only from FanGraphs; omitted when that source is unavailable.
recent_pitcher_era v3
Actual ERA over last 3 starts vs. season xFIP baseline. Captures hot/cold streaks and mechanical changes that peripheral stats deliberately strip out.
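The projected_starter_ip entry above describes a 60/25/15 blend and a starter/bullpen split weight. The sketch below covers only those two pieces and rests on assumptions: the split is taken as projected IP divided by nine innings (which reproduces the quoted 72% / 28% at 6.5 IP), the combined staff xFIP is a hypothetical way to use that weight, and the Bayesian smoothing, rest, and pitch-count adjustments are omitted because their exact form is not given.

```python
def blended_ip(season_ip: float, last5_ip: float, last3_ip: float) -> float:
    """60/25/15 weighted blend of IP per start (no smoothing applied here)."""
    return 0.60 * season_ip + 0.25 * last5_ip + 0.15 * last3_ip


def starter_share(projected_ip: float, game_innings: float = 9.0) -> float:
    """Assumed split weight: fraction of a nine-inning game the starter covers."""
    return min(max(projected_ip / game_innings, 0.0), 1.0)


def staff_xfip(starter_xfip: float, bullpen_xfip: float, projected_ip: float) -> float:
    """Hypothetical IP-weighted xFIP across starter and bullpen."""
    w = starter_share(projected_ip)
    return w * starter_xfip + (1.0 - w) * bullpen_xfip
```

Under this assumption, a longer projected outing shifts weight from the bullpen number to the starter number, so the same bullpen matters less behind a deep starter.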
🌤 Situational Layer
Savant · Open-Meteo · UmpScoreCards
park_factor
3yr static anchor blended with running 2026 RS/RA splits at ballpark GPS coordinates — updated daily.
weather_run_delta
Wind mph × bearing → HR carry/suppress; temp °F; humidity at first pitch. Converted to expected Δruns/9.
umpire_run_factor
Historical run-impact mean per ump (142 tracked). Tight-zone umps suppress scoring; wide-zone umps inflate λ.
def_oaa_delta
Savant OAA (outs above average) → runs above average vs. league mean → win probability delta.
catcher_framing_runs inactive
Savant pitch-level framing runs (called-strike prob over replacement) → run delta per game. Currently inactive — framing data not yet wired into the live pipeline.
precip_probability v3
Open-Meteo forecast precipitation probability (0–1) at game time. Above 30%: shrinks adjustment stack up to 12%, reflecting higher environmental variance. Domed stadiums unaffected.
📈 Lineup & Market Layer
MLB Stats API · Pinnacle / Odds API
lineup_ops_split
Per-batter OPS vs. LHP/RHP from posted lineups only. Weighted OPS delta vs. team season mean, matched to starter's hand.
lineup_quality_delta
Today's lineup aggregate OPS vs. team season average. Detects rest-day lineups and injury absences.
lineup_certainty v2
When lineups aren't confirmed at pick time, the adjustment stack is shrunk 5% toward the Poisson base — correctly reducing confidence without distorting the run-environment estimate.
bvp_career_matchup v2
Career batter-vs-pitcher xwOBA from Baseball Savant, PA-weighted regression toward league average. Capped ±1pp per batter, ±2.5pp per team. Captures genuine matchup history without overfitting small samples. Displayed on pick cards as Career Matchup.
starter_rest_adj
Days since last start + prior pitch count penalty. Captures rest-days fatigue only — pitch quality (xFIP) is handled separately via the λ ratio. Displayed as Starter Rest on pick cards. Short rest (≤3 days): −2.5pp. Extra rest: +0.5pp. Prior start 105+ pitches: additional penalty.
bullpen_availability inactive
Binary fatigue flag: closer/setup used 2+ times in last 3 days → tier downgrade in bullpen xFIP blend. Currently inactive — availability data not yet wired into the live pipeline.
opener_shrinkage v2
When a starter is TBD or listed as an opener, the full adjustment stack is shrunk 20% — the xFIP-based projection is unreliable and the model should not over-project on opener games.
novig_pinnacle_prob
Pinnacle moneyline de-juiced via simultaneous margin strip. Sharpest market benchmark — hold ≈ 2.5%.
book_total_constraint
The market O/U total rescales the Poisson λ values, which prevents model run totals from diverging sharply from the sharp total market.
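The lineup_certainty and opener_shrinkage entries above both shrink the adjustment stack toward the Poisson base. One way to picture that is a multiplier on Σαᵢ in logit space. This is a sketch under assumptions: applying the shrink in logit space and combining the two shrinks multiplicatively are guesses made for illustration, and the function names are hypothetical.

```python
import math


def shrink_factor(lineup_confirmed: bool, starter_is_opener_or_tbd: bool) -> float:
    """Multiplier applied to the summed adjustments (1.0 = no shrinkage)."""
    factor = 1.0
    if not lineup_confirmed:
        factor *= 0.95  # 5% shrink when lineups are unconfirmed
    if starter_is_opener_or_tbd:
        factor *= 0.80  # 20% shrink on opener / TBD starter games
    return factor


def adjusted_prob(p_base: float, alphas: list[float], factor: float) -> float:
    """Apply the shrunk adjustment stack to the Poisson base in logit space."""
    z = math.log(p_base / (1.0 - p_base)) + factor * sum(alphas)
    return 1.0 / (1.0 + math.exp(-z))
```

With a factor of 0 the result collapses to p_base exactly, which matches the idea of shrinking toward the Poisson base rather than toward 50%.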
From Model to Pick — A Practical Example
1
Dual-Poisson base
Team RS/RA, park factor, and starter xFIP generate λ_home = 4.2 runs, λ_away = 3.8 runs → P(home wins) = 53.1%
2
33 log-odds adjustments
Home starter Stuff+ 118 (+0.3pp), lineup quality +0.8pp, TTO penalty −0.4pp, umpire wide zone +0.2pp, BvP +0.5pp → net +1.4pp adjustment
3
Platt calibration
Logit-scaled by fitted slope 0.590 + intercept 0.025 → final model probability: 54.5%
4
Market comparison
Pinnacle no-vig says home team wins 50.3%. Model says 54.5%. Gap = +4.2% edge → pick is posted at +148 best available odds.
This pick's signal breakdown appears in the Details section of each pick card on the Today's Picks page. CLV is tracked at close to verify the market agreed with the model's direction.
Out-of-Sample Validation — 2022–2024 · Flat $100/Bet · 5%+ Edge Threshold
Games Backtested: 7,359 · 2022–2024 · OOS only
Sharpe Ratio: 1.82 · annualized · risk-adj.
Max Drawdown: −12.4u · peak-to-trough worst run
Brier Score: 0.234 · vs. 0.250 market · ↓ better
+10.4% — Improving Every Out-of-Sample Year
Flat $100/bet · 5%+ edge signals only · vs. −4.0% baseline (bet every game, full vig)
2022: +0.7% · 975 bets · +6.8u
2023: +9.1% · 1,093 bets · +99.5u
2024: +10.4% · 977 bets · +101.6u
Baseline: −4.0% · all games · vig drain
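The yearly ROI figures above follow from the bet counts and unit totals, assuming one unit equals the flat $100 stake so that ROI is simply units won divided by bets placed. The short check below recomputes them; the dictionary layout and function name are illustrative only.

```python
# (bets, units won) per season, copied from the figures above.
seasons = {
    2022: (975, 6.8),
    2023: (1093, 99.5),
    2024: (977, 101.6),
}


def roi_pct(bets: int, units: float) -> float:
    """ROI in percent for flat one-unit stakes: units won / units risked."""
    return 100.0 * units / bets


yearly = {year: round(roi_pct(b, u), 1) for year, (b, u) in seasons.items()}

# Pooled figure across all three seasons (a derived number, not quoted above).
total_bets = sum(b for b, _ in seasons.values())
total_units = sum(u for _, u in seasons.values())
pooled = round(roi_pct(total_bets, total_units), 1)
```

The recomputed values match the quoted +0.7%, +9.1%, and +10.4%; pooling all 3,045 bets gives roughly +6.8%.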
Other Markets — Run Line & Over/Under Calibration
🏃 Run Line Calibration
Platt scaling — RL-specific constants, fitted OOS 2022–2024
p_rl_cal = σ(A_rl · logit(p_rl_raw) + B_rl) = 1 / (1 + exp(−(0.8188 · logit(p_rl_raw) − 0.1035)))
Parameter interpretation
A_rl = 0.8188 // < 1: slight overconfidence; less shrinkage than ML model
B_rl = −0.1035 // small away-team correction
Backtest results — 7,237 games, OOS 2022–2024
Home cover rate: 35.7% actual vs 35.6% model
Direction acc.: 64.7% (better side predicted)
Away +1.5, 5% edge, −130 juice:
Model ROI: +22.3% // vs +13.8% blind baseline
Win rate: 67.2% // vs 65.5% at ≥0% edge
Away +1.5 win rate vs model edge:
0% edge: 65.5%
5% edge: 67.2%
8% edge: 68.4%
10% edge: 69.6% // genuine discriminatory power
RL edge = p_rl_cal − p_nv_Pinnacle_RL. Bet flagged when edge ≥ 5%. CLV tracked separately via Pinnacle spreads pre-game closing line snapshot. Home −1.5 volume is low (<40 bets at ≥8% edge over 3 seasons) — treated as a specialty bet requiring high model confidence.
📊 Over/Under Calibration
Platt scaling — O/U constants, fitted OOS 2022–2024
p_over_cal = σ(A_ou · logit(p_over_raw) + B_ou) = 1 / (1 + exp(−(0.5268 · logit(p_over_raw) − 0.2020)))
Parameter interpretation
A_ou = 0.5268 // heavy shrinkage toward 50%: model significantly over-expresses confidence on totals
B_ou = −0.2020 // corrects systematic over-prediction on high-scoring games
Status — informational, not separately bet
Synthetic-line ROI @ 0% edge: +3.0%
Synthetic-line ROI @ 2% edge: −1.4%
Bias by total line:
< 7.5 runs: −1.06 run underestimate
10.0 runs: +0.81 run overestimate
// O/U market prices now captured live (real over/under odds shown in Other Markets card).
// Standalone O/U bets deferred until live edge evidence accumulates over several weeks.
Real O/U book prices (best execution + Pinnacle no-vig benchmark) are fetched daily and displayed in each pick card's Other Markets section. The heavier Platt shrinkage (A = 0.527 vs. A = 0.590 for ML) indicates that the Poisson model is less well calibrated on run totals, likely because of left-tail inflation: low-scoring games occur more often than a Poisson distribution predicts at low k.