
How NBA Betting Models Are Built and Tested

A betting model is only useful if you understand what it does, what it does not do, and how it has been evaluated. This page explains the approach behind Linelabs and what honest model transparency looks like.

What a betting model does

It estimates probability — nothing more.

A sports betting model takes a set of input features about a game and outputs a probability. "This team has a 57% chance of covering the spread." The model does not predict the future. It estimates likelihoods based on patterns in historical data.

That probability is then compared to the probability implied by the current betting line. If the model says 57% and the market implies 50%, there is a potential edge. That gap is the basis for expected value — but only if the model is well-calibrated and the estimate is trustworthy.
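To make that comparison concrete, here is a minimal Python sketch of the arithmetic. The -110 price and the 57% model estimate are illustrative inputs, not Linelabs output.

```python
def implied_prob(american_odds: int) -> float:
    """Probability implied by an American price (includes the book's vig)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def expected_value(model_prob: float, american_odds: int) -> float:
    """Expected profit per 1 unit staked, given the model's win probability."""
    payout = 100 / -american_odds if american_odds < 0 else american_odds / 100
    return model_prob * payout - (1 - model_prob)


print(implied_prob(-110))          # ~0.524: the hurdle the model must clear
print(expected_value(0.57, -110))  # ~+0.088 units of EV per unit staked
```

If the model's 57% is trustworthy, that gap over the implied 52.4% is the edge; if the model is overconfident, the EV number is fiction.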

A model that is overconfident — that says 70% when the true probability is 55% — is more dangerous than no model at all. It produces inflated EV estimates that lead to overbetting on edges that do not exist.

The features that matter for NBA

Form, rest, pace, and market signals carry the most weight.

Rolling team form

Win rate, points scored and allowed, and ATS record over the last 10–15 games. Recent form is more predictive than season averages for spread outcomes.

Rest and schedule

Days of rest between games, games played in the last 14 days, and whether a team is on the second night of a back-to-back. Rest deltas between teams matter more than rest in isolation.

Travel and time zone

Cross-country travel and east-to-west time zone shifts affect performance, particularly in the first quarter of games. These effects are small but detectable at scale.

Market-derived features

Opening spread, current spread, movement from open to current, and the implied win probability from the line itself. The market encodes sharp opinion — ignoring it is a mistake.
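Pulled together, features like these might be assembled from a game log roughly as follows. This is a sketch, not the production pipeline; the column names (team, game_date, ats_cover, spread_open, spread_current) are hypothetical.

```python
import pandas as pd


def build_features(games: pd.DataFrame, window: int = 10) -> pd.DataFrame:
    """One row per team per game. Columns assumed for illustration:
    team, game_date (datetime), points_for, points_against,
    ats_cover (0/1), spread_open, spread_current."""
    g = games.sort_values(["team", "game_date"]).copy()
    g["margin"] = g["points_for"] - g["points_against"]
    grp = g.groupby("team")

    # Rolling team form over the previous `window` games.
    # shift(1) keeps each row blind to its own result (no lookahead).
    g["form_ats"] = grp["ats_cover"].transform(
        lambda s: s.shift(1).rolling(window).mean())
    g["form_margin"] = grp["margin"].transform(
        lambda s: s.shift(1).rolling(window).mean())

    # Rest and schedule.
    g["rest_days"] = grp["game_date"].diff().dt.days
    g["back_to_back"] = (g["rest_days"] == 1).astype(int)

    # Market-derived features known before tip-off.
    g["line_move"] = g["spread_current"] - g["spread_open"]
    return g
```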

Walk-forward backtesting

The right way to evaluate a sports betting model.

Standard backtesting trains a model on historical data and evaluates it on the same data. That produces impressive numbers that mean nothing — the model has already seen the answers.

Walk-forward backtesting simulates how the model would actually be used: train on weeks 1 through N, predict week N+1, advance the window, repeat. Each prediction is made on data the model has never seen. The evaluation is honest because it mirrors the real deployment setting.
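In code, the walk-forward loop is just a sliding training window. The sketch below assumes a week column and a covered label on the feature table; the hyperparameters are placeholders, not the tuned Linelabs configuration.

```python
import pandas as pd
from xgboost import XGBClassifier


def walk_forward(df: pd.DataFrame, feature_cols: list[str],
                 label_col: str = "covered", week_col: str = "week",
                 min_train_weeks: int = 6) -> pd.DataFrame:
    """Train on weeks 1..N, predict week N+1, slide forward, repeat.
    Every prediction is made on data from strictly earlier weeks."""
    weeks = sorted(df[week_col].unique())
    out = []
    for i in range(min_train_weeks, len(weeks)):
        train = df[df[week_col].isin(weeks[:i])]
        test = df[df[week_col] == weeks[i]]
        model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.05)
        model.fit(train[feature_cols], train[label_col])
        preds = test.copy()
        preds["p_cover"] = model.predict_proba(test[feature_cols])[:, 1]
        out.append(preds)
    return pd.concat(out)
```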

No lookahead bias

Features are constructed using only data available before each game, not after. A common mistake is including statistics that could not have been known at bet time.

Regime labeling

Early-, mid-, and late-season NBA dynamics differ. Walk-forward evaluation across full seasons shows whether the model degrades in the playoffs or during periods of schedule congestion.

Small sample honesty

An 82-game NBA season produces only 82 spread results and 82 total results per team, a few hundred data points at most. Real backtests have small samples. Confidence intervals should be wide, and results should be presented with that context.
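A Wilson score interval is one simple way to show that width. The 224-win, 400-bet record below is a made-up example.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for an observed win rate."""
    p = wins / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# A 56% record over 400 bets still carries a wide interval around it.
print(wilson_interval(224, 400))  # roughly (0.51, 0.61)
```

That interval still contains the -110 breakeven rate of about 52.4%, which is exactly why small-sample results deserve caution.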

Calibration vs accuracy

Accuracy tells you if the model is right. Calibration tells you if you can trust the probabilities.

A model with 56% accuracy on spread outcomes is useful; at standard -110 pricing, breakeven is roughly 52.4%. But the more important question is: when the model says 60%, does that happen 60% of the time?

Calibration is measured using tools like the Brier score and reliability diagrams. A well-calibrated model allows you to use probability estimates directly in EV calculations. An uncalibrated model produces numbers that look precise but are not.

Linelabs applies probability calibration (logistic regression on recent outcomes) after the base XGBoost model is trained. The goal is to ensure the 60% predictions win close to 60% of the time — not to inflate confidence.
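A Platt-style layer is one common way to implement that kind of calibration; the sketch below shows the idea with scikit-learn and is not the exact Linelabs code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss


def fit_calibrator(raw_probs: np.ndarray, outcomes: np.ndarray) -> LogisticRegression:
    """Fit a logistic regression that maps the base model's raw probabilities
    to calibrated ones, using recent outcomes the base model was not trained on."""
    return LogisticRegression().fit(raw_probs.reshape(-1, 1), outcomes)


def calibrate(calibrator: LogisticRegression, raw_probs: np.ndarray) -> np.ndarray:
    """Adjusted probabilities for new predictions."""
    return calibrator.predict_proba(raw_probs.reshape(-1, 1))[:, 1]


# Lower Brier score after calibration means the probabilities track reality better:
# brier_score_loss(outcomes, raw_probs) vs
# brier_score_loss(outcomes, calibrate(cal, raw_probs))
```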

What models cannot do

Randomness is real and persistent.

A 55% win rate on spread bets means you lose 45% of the time. Over 100 bets, variance routinely produces losing streaks of five or more in a row and drawdowns that stretch across 15–20 bets, even at positive EV. Models do not eliminate downswings — they shift the long-run expectation.
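A quick simulation makes the variance point concrete. The 55% win rate and the 100-bet samples are assumptions for illustration.

```python
import random


def longest_losing_streak(win_prob: float, n_bets: int) -> int:
    """Longest run of consecutive losses in one simulated betting sample."""
    streak = worst = 0
    for _ in range(n_bets):
        if random.random() < win_prob:
            streak = 0
        else:
            streak += 1
            worst = max(worst, streak)
    return worst


random.seed(7)
runs = [longest_losing_streak(0.55, 100) for _ in range(10_000)]
print(sum(runs) / len(runs))  # typically around 5-6 losses in a row
print(max(runs))              # tail samples produce much longer runs
```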

Models cannot account for information that is not yet in the data: a player whose minutes are unexpectedly limited mid-game, a coach's tactical adjustment mid-series, or a team resting stars with a lead late in the fourth. These are irreducible uncertainties.

The appropriate response is not to abandon modeling but to keep position sizes proportional to your actual edge — and to never mistake a winning week for validation of an unchecked model.

The Linelabs model in plain terms

Algorithm

XGBoost gradient boosting classifier. Separate models for spread and total outcomes. Trained weekly on the current season plus historical data.

Validation method

Walk-forward cross-validation by week. Each week is predicted using only prior data. No full-season train/test splits that would produce optimistic results.

Calibration

Logistic regression calibration layer applied after the base model. Output probabilities are adjusted to better reflect true win rates in the target confidence range.

Transparency

Model accuracy, Brier score, log loss, and ROI are surfaced on the results page with sample size shown. These numbers are updated weekly and are not selectively reported.
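A summary of that kind can be reproduced directly from walk-forward predictions. The sketch below assumes flat one-unit stakes at -110 and bets only the "cover" side when the probability clears a threshold; it mirrors the metric names above rather than the exact reporting code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, brier_score_loss, log_loss


def weekly_report(p_cover: np.ndarray, covered: np.ndarray,
                  threshold: float = 0.5, win_payout: float = 100 / 110) -> dict:
    """Summarise out-of-sample predictions: accuracy, Brier score, log loss,
    and flat-stake ROI, with the sample size reported alongside."""
    picks = (p_cover >= threshold).astype(int)
    bets = picks == 1
    profits = np.where(covered[bets] == 1, win_payout, -1.0)  # per 1-unit stake
    return {
        "n_games": int(len(covered)),
        "n_bets": int(bets.sum()),
        "accuracy": float(accuracy_score(covered, picks)),
        "brier": float(brier_score_loss(covered, p_cover)),
        "log_loss": float(log_loss(covered, p_cover)),
        "roi": float(profits.mean()) if bets.any() else 0.0,
    }
```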
