Monte Carlo and Bootstrap: Stress-Testing Your Strategy's Equity Curve

A backtest produces exactly one equity curve — one ordering of trades, one drawdown, one final number. The market could easily have dealt the same edge in a different order and handed you a far uglier ride. Monte Carlo and bootstrap methods exist to answer the question a single backtest cannot: how lucky was that one path, and how bad could it realistically have been?

Why one equity curve lies to you
Your reported max drawdown is only the worst slump that happened to occur in your historical trade order. Reshuffle the same trades and the worst drawdown will often come out noticeably deeper, sometimes twice as deep. If your position sizing assumes the historical drawdown is the worst case, you are under-capitalized and do not know it yet.

Trade-order resampling (the bootstrap)
The simplest and most honest technique:

Take your list of historical trade returns.
Resample them to build thousands of alternative equity curves from the same edge. Shuffling the order keeps the same set of trades, so only the path and its drawdowns change; drawing with replacement also varies the final return.
Collect the distribution of outcomes: median return, 5th-percentile return, the distribution of maximum drawdowns, and how often the account would have been ruined.

Now instead of "max drawdown was 18%" you can say "in 95% of plausible orderings the maximum drawdown was no deeper than 31%", a number you can actually size against.
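The three steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names (`max_drawdowns`, `bootstrap_trades`), full reinvestment of each trade's return, the definition of "ruin" as equity falling to half of starting capital, and the synthetic trade list are all made up for the example.

```python
import numpy as np

def max_drawdowns(equity):
    """Largest peak-to-trough decline along the last axis, as a positive fraction."""
    peaks = np.maximum.accumulate(equity, axis=-1)
    return np.max(1.0 - equity / peaks, axis=-1)

def bootstrap_trades(trade_returns, n_paths=5000, ruin_level=0.5, seed=0):
    """Resample trades with replacement and summarize the alternative paths."""
    rng = np.random.default_rng(seed)
    r = np.asarray(trade_returns, dtype=float)
    # Each row of idx is one alternative history, same length as the original.
    idx = rng.integers(0, len(r), size=(n_paths, len(r)))
    # Assumption: every trade's return is compounded on the full account.
    equity = np.cumprod(1.0 + r[idx], axis=1)
    # Prepend starting equity of 1.0 so an opening loss counts as a drawdown.
    equity = np.hstack([np.ones((n_paths, 1)), equity])
    dd = max_drawdowns(equity)
    final = equity[:, -1] - 1.0
    return {
        "median_return": float(np.median(final)),
        "p5_return": float(np.percentile(final, 5)),
        "median_max_dd": float(np.median(dd)),
        "p95_max_dd": float(np.percentile(dd, 95)),
        # Assumption: "ruin" means equity touched ruin_level x starting capital.
        "ruin_prob": float(np.mean(equity.min(axis=1) <= ruin_level)),
    }

# Synthetic per-trade returns, standing in for a real backtest's trade list.
trades = np.random.default_rng(42).normal(0.004, 0.03, size=250)
historical_dd = float(max_drawdowns(np.cumprod(1.0 + trades)))
stats = bootstrap_trades(trades)
print(f"historical max drawdown: {historical_dd:.1%}")
for name, value in stats.items():
    print(f"{name}: {value:.1%}")
```

The `p95_max_dd` figure is the one to size against: it is the drawdown that 95% of the resampled paths did not exceed. Swapping `rng.integers` for a per-row permutation would give the pure reshuffle variant, in which the final return is identical on every path.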

Monte Carlo on the model, not just the trades
Resampling trades assumes the trades themselves are representative. You can go further and simulate the process: model returns with a fitted distribution (fat tails, not Gaussian — markets have them), or block-bootstrap to preserve autocorrelation and volatility clustering that naive shuffling destroys. Block methods matter: if your strategy depends on trends, shuffling individual days erases the very structure it trades.
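One way to keep that structure is to resample contiguous blocks instead of single observations. The sketch below uses a circular block bootstrap with a fixed block length, which is only one of several block schemes. The function name `circular_block_bootstrap` and the block length of 10 are assumptions for illustration; in practice the block length is a tuning choice that should roughly match how long the dependence in your returns lasts.

```python
import numpy as np

def circular_block_bootstrap(returns, block_len, n_paths=2000, seed=0):
    """Build resampled return paths from contiguous blocks of the original series.

    Blocks that run past the end wrap around to the start (the "circular" part),
    so every observation is equally likely to be drawn.
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(returns, dtype=float)
    n = len(r)
    n_blocks = -(-n // block_len)  # ceiling division: enough blocks to cover n
    # Random starting position for every block on every path.
    starts = rng.integers(0, n, size=(n_paths, n_blocks))
    # Expand each start into block_len consecutive indices, wrapping modulo n.
    idx = (starts[:, :, None] + np.arange(block_len)) % n
    # Flatten the blocks into one path per row and trim to the original length.
    idx = idx.reshape(n_paths, -1)[:, :n]
    return r[idx]

# Synthetic daily returns, used here only to show the call.
daily = np.random.default_rng(7).normal(0.0005, 0.01, size=500)
paths = circular_block_bootstrap(daily, block_len=10)
print(paths.shape)
```

Within each block the original day-to-day ordering survives, so short-lived trends and volatility clusters carry over into the resampled paths; only the joins between blocks are artificial.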

What to actually measure

Risk of ruin / drawdown distribution. Not the single historical drawdown — the whole distribution of plausible ones.
Confidence interval on Sharpe. A Sharpe of 1.4 from 60 trades has enormous error bars. Bootstrapping shows you whether it is distinguishable from zero.
Probability of a losing year. Far more honest than a single backtested annual return.
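The Sharpe interval and the losing-year probability can both be estimated from the same resampling idea. The following is a hedged sketch: the function names, the percentile-style interval, the per-trade (unannualized) Sharpe, and the `trades_per_year` parameter are assumptions for illustration, and the plain resampling treats trades as independent.

```python
import numpy as np

def sharpe(r, axis=-1):
    """Per-trade Sharpe ratio: mean over sample standard deviation (not annualized)."""
    return np.mean(r, axis=axis) / np.std(r, axis=axis, ddof=1)

def bootstrap_sharpe_ci(trade_returns, n_boot=10000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the per-trade Sharpe ratio."""
    rng = np.random.default_rng(seed)
    r = np.asarray(trade_returns, dtype=float)
    idx = rng.integers(0, len(r), size=(n_boot, len(r)))
    samples = sharpe(r[idx])
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(samples, [alpha, 1.0 - alpha])
    return float(lo), float(hi)

def prob_losing_year(trade_returns, trades_per_year, n_boot=10000, seed=0):
    """Share of simulated years (trades_per_year resampled trades) that lose money."""
    rng = np.random.default_rng(seed)
    r = np.asarray(trade_returns, dtype=float)
    idx = rng.integers(0, len(r), size=(n_boot, trades_per_year))
    yearly = np.prod(1.0 + r[idx], axis=1) - 1.0
    return float(np.mean(yearly < 0.0))

# Synthetic sample of 60 trades, standing in for a real backtest.
trades = np.random.default_rng(3).normal(0.01, 0.05, size=60)
lo, hi = bootstrap_sharpe_ci(trades)
print(f"per-trade Sharpe: {float(sharpe(trades)):.2f}, 95% CI: [{lo:.2f}, {hi:.2f}]")
print(f"P(losing year): {prob_losing_year(trades, trades_per_year=60):.1%}")
```

If the lower end of the interval sits at or below zero, the sample does not distinguish the strategy from one with no edge.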

The honesty checks that matter most

Multiple-testing correction. If you tried 200 strategies and kept the best, its Sharpe is inflated by selection. The "deflated Sharpe ratio" and similar adjustments exist precisely for this.
Fat tails. Simulating from a fitted normal distribution will understate tail risk badly. Use the empirical distribution or one with heavy tails.
Dependence. If returns are autocorrelated, a plain bootstrap treats them as independent and will typically overstate your certainty. Use block bootstrap.
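The selection effect in the first check can be seen with a small simulation: generate many strategies that have no edge at all, keep the best one, and look at its Sharpe. This is not the deflated Sharpe ratio itself, only an illustration of why such a correction is needed. The function name `best_of_n_sharpe`, the Student-t returns with 5 degrees of freedom, the 252-day year, and the scale factor are all assumptions chosen for the example.

```python
import numpy as np

def best_of_n_sharpe(n_strategies=200, n_days=252, n_sims=200, seed=0):
    """Annualized Sharpe of the best of n_strategies zero-edge strategies.

    Returns one "best Sharpe" per simulated research run.
    """
    rng = np.random.default_rng(seed)
    # Zero-mean, heavy-tailed daily returns: by construction no strategy has an edge.
    r = 0.01 * rng.standard_t(df=5, size=(n_sims, n_strategies, n_days))
    sr = r.mean(axis=2) / r.std(axis=2, ddof=1) * np.sqrt(252)
    # Selection step: keep only the best-looking strategy in each run.
    return sr.max(axis=1)

best = best_of_n_sharpe()
print(f"median Sharpe of the best of 200 zero-edge strategies: {np.median(best):.2f}")
```

Every strategy here has a true Sharpe of zero, so whatever the best one shows is pure selection. Any backtested Sharpe that came out of a search over many variants should be judged against that baseline, not against zero.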

Bottom line
A single backtest is a single sample from a noisy process, and treating it as destiny is how accounts blow up on a drawdown the backtest "never showed." Monte Carlo and bootstrap turn that one sample into a distribution, so you can size positions against the bad paths instead of the lucky one. It is cheap, it is a few lines in R or Python, and it will change how much risk you are willing to take. Run it before you go live, not after the drawdown teaches you the hard way.

How do you stress-test your equity curve — trade shuffling, block bootstrap, full Monte Carlo? Share your method below.