Statistical Computing for Traders: Time Series, Stationarity and Honest Backtest Stats
Most trading "research" dies not from a bad idea but from bad statistics. Statistical computing — using tools like R or Python to analyse market data rigorously — is what separates a strategy you can trust from a curve you fooled yourself with. This is a practical tour of the concepts that matter, and the mistakes that quietly ruin results.
The tools
- R is built by statisticians for statisticians. For time-series work, hypothesis testing and clean plots, packages like xts, quantmod, forecast and PerformanceAnalytics are hard to beat.
- Python wins on integration and scale: pandas, NumPy, statsmodels and scikit-learn cover the same ground and plug straight into execution and machine-learning stacks.
Use whichever you will actually be disciplined in. The statistics are identical; only the syntax changes.
Stationarity: the concept that breaks most backtests
A price series is typically non-stationary: its mean and variance wander over time. Most statistical methods assume the opposite. Run a regression or a correlation on raw prices and you will find impressive relationships that are purely spurious: two unrelated rising series look strongly "correlated" simply because both trend up.
- Difference the series, or use returns instead of price levels, to get something closer to stationary.
- Test it with an Augmented Dickey-Fuller (ADF) or KPSS test before trusting any model built on the series. The two have opposite null hypotheses (ADF: unit root; KPSS: stationarity), so running both is a useful cross-check.
- For pairs trading, test for genuine cointegration (e.g. Engle-Granger / Johansen), not just correlation.
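The spurious-correlation trap is easy to reproduce. The sketch below is a minimal illustration using only the Python standard library; the price paths are simulated, and the drift, volatility and seed are arbitrary assumptions rather than market estimates. It builds two independent upward-drifting random walks, then compares the correlation of their price levels with the correlation of their log returns.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = math.fsum(x) / n, math.fsum(y) / n
    cov = math.fsum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = math.fsum((a - mx) ** 2 for a in x)
    vy = math.fsum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def simulate_prices(rng, n=2000, drift=0.001, vol=0.01, start=100.0):
    """Hypothetical price path: geometric random walk with upward drift."""
    prices = [start]
    for _ in range(n):
        prices.append(prices[-1] * math.exp(rng.gauss(drift, vol)))
    return prices


def log_returns(prices):
    """First difference of log prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


rng = random.Random(42)
a = simulate_prices(rng)
b = simulate_prices(rng)  # independent of `a` by construction

level_corr = pearson(a, b)
return_corr = pearson(log_returns(a), log_returns(b))

print(f"correlation of price levels: {level_corr:+.2f}")
print(f"correlation of log returns:  {return_corr:+.2f}")
```

The levels should look tightly linked while the returns show essentially nothing, which is the honest answer for two series that share no relationship. For formal testing on real data, statsmodels offers adfuller, kpss and coint in statsmodels.tsa.stattools.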
Time-series tools worth knowing
- ARIMA for linear autocorrelation structure.
- GARCH for volatility clustering: calm and stormy periods each tend to persist, and forecasting volatility is often more reliable than forecasting direction.
- Autocorrelation/partial-autocorrelation plots to see what structure actually exists before you model it.
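Volatility clustering shows up clearly in an autocorrelation check. The following sketch, again standard-library Python, simulates returns from a GARCH(1,1) process and compares the sample autocorrelation of the returns with that of the squared returns. The parameters omega, alpha and beta are illustrative assumptions, not fitted values, and the confidence band is the rough white-noise approximation.

```python
import math
import random


def acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag."""
    n = len(series)
    mean = math.fsum(series) / n
    dev = [x - mean for x in series]
    denom = math.fsum(d * d for d in dev)
    return [
        math.fsum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def simulate_garch(rng, n, omega=0.05, alpha=0.10, beta=0.85):
    """Hypothetical GARCH(1,1): var_t = omega + alpha*r_{t-1}^2 + beta*var_{t-1}."""
    var = omega / (1.0 - alpha - beta)  # start at the long-run variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        var = omega + alpha * r * r + beta * var
    return returns


rng = random.Random(7)
r = simulate_garch(rng, 10_000)

ret_acf = acf(r, 5)
sq_acf = acf([x * x for x in r], 5)
band = 1.96 / math.sqrt(len(r))  # rough 95% band under white noise

print("lag  returns  squared")
for k in range(1, 6):
    print(f"{k:>3}  {ret_acf[k]:+.3f}   {sq_acf[k]:+.3f}")
print(f"approx. 95% band: +/-{band:.3f}")
```

Returns with little or no autocorrelation alongside clearly autocorrelated squared returns is the signature that motivates a volatility model rather than a directional one.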
Honest backtest statistics
A single equity curve tells you almost nothing. Demand the numbers that reveal fragility:
- Sharpe and Sortino for risk-adjusted return; report the sample size behind them.
- Maximum drawdown and time-to-recover — the pain you must survive.
- Trade count and statistical significance — 20 trades cannot support strong claims.
- Confidence intervals / bootstrapping. Resample your returns to ask: how much of this could be luck?
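A rough sketch of these numbers in standard-library Python follows. The daily returns are simulated placeholders, the risk-free rate is assumed to be zero, and the 252-day annualisation is a convention, not something taken from the text above. The bootstrap resamples returns independently, so it ignores autocorrelation; a block bootstrap would be a more careful choice for serially dependent returns.

```python
import math
import random


def sharpe(returns, periods_per_year=252):
    """Annualised Sharpe ratio, assuming a zero risk-free rate."""
    n = len(returns)
    mean = math.fsum(returns) / n
    var = math.fsum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def bootstrap_ci(returns, stat, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat` (iid resampling)."""
    rng = random.Random(seed)
    n = len(returns)
    draws = sorted(stat(rng.choices(returns, k=n)) for _ in range(n_boot))
    lo = draws[int((1 - level) / 2 * n_boot)]
    hi = draws[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Hypothetical strategy: about two years of simulated daily returns
rng = random.Random(11)
daily = [rng.gauss(0.0005, 0.01) for _ in range(500)]

point = sharpe(daily)
mdd = max_drawdown(daily)
lo, hi = bootstrap_ci(daily, sharpe)

print(f"n = {len(daily)} daily returns")
print(f"Sharpe: {point:.2f}  (95% bootstrap CI {lo:.2f} to {hi:.2f})")
print(f"Max drawdown: {mdd:.1%}")
```

With only a couple of years of daily data the interval around the Sharpe ratio is typically wide, which is exactly why the sample size belongs next to the headline number.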
The silent killers
- Multiple-testing bias. Test 200 ideas at a 5% significance level and about 10 will look brilliant by chance alone, even if none has any real edge. The more you try, the higher your bar for significance must be.
- Survivorship bias. Backtesting only today's surviving stocks ignores everything that went to zero — flattering and false.
- Look-ahead bias. Using data that was not available at decision time. The most common and most embarrassing error.
- Data snooping. Tweaking until the backtest looks good is just overfitting with extra steps.
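The multiple-testing point can be checked by simulation. This sketch, standard-library Python with an arbitrary seed, generates 200 hypothetical strategies whose true edge is zero, counts how many clear a naive 5% two-sided threshold, and then applies a Bonferroni-adjusted threshold. Bonferroni is used here as one simple, conservative correction; it is an assumption of this example, not the only way to raise the bar.

```python
import math
import random
from statistics import NormalDist


def t_stat(returns):
    """t-statistic of the mean return against zero."""
    n = len(returns)
    mean = math.fsum(returns) / n
    var = math.fsum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var / n)


rng = random.Random(1)
n_strategies, n_days, alpha = 200, 250, 0.05

# Every hypothetical strategy is pure noise: zero true edge
t_stats = [
    t_stat([rng.gauss(0.0, 0.01) for _ in range(n_days)])
    for _ in range(n_strategies)
]

# Two-sided critical values under a normal approximation
naive_cut = NormalDist().inv_cdf(1 - alpha / 2)
bonf_cut = NormalDist().inv_cdf(1 - alpha / (2 * n_strategies))

naive_hits = sum(abs(t) > naive_cut for t in t_stats)
bonf_hits = sum(abs(t) > bonf_cut for t in t_stats)

print(f"naive cutoff |t| > {naive_cut:.2f}: {naive_hits} 'discoveries'")
print(f"Bonferroni cutoff |t| > {bonf_cut:.2f}: {bonf_hits} 'discoveries'")
```

Every "discovery" under the naive cutoff is a false positive by construction, which is what a backtest leaderboard looks like when the only thing being measured is luck.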
Bottom line
Statistical computing is not academic decoration — it is your defence against fooling yourself. Make series stationary before you model them, prefer cointegration to correlation, report risk-adjusted stats with their sample size, and treat every impressive result as guilty until proven robust out-of-sample. The market is the harshest peer reviewer there is.
What does your statistical checklist look like before a strategy goes live? Share your must-run tests.