Survivorship bias is a systematic error in backtesting that arises when historical data includes only instruments that still exist today. Companies that went bankrupt, were delisted, or were acquired disappear from the dataset. Because bankrupt and delisted companies tend to be among the worst performers, their exclusion makes any strategy tested on the surviving universe appear more profitable than it would have been in reality.
How survivorship bias distorts results
Consider a strategy that buys the cheapest stocks by price-to-earnings ratio. Many cheap stocks are cheap for a reason: they are in financial distress. Some of these companies eventually go bankrupt and get delisted. In a survivorship-biased dataset, these bankrupt companies are absent, so the backtest never experiences the total loss that a real trader would have suffered. The result is a backtest that overstates returns, sometimes dramatically.
Studies have estimated that survivorship bias can inflate annual equity returns by 1 to 2 percentage points or more, depending on the strategy and universe. For strategies that specifically target distressed or small-cap stocks, the bias can be even larger because these segments have higher delisting rates.
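The effect on the cheap-stock strategy described above can be illustrated with a toy calculation. The sketch below uses five hypothetical tickers with invented one-year returns (none of the figures come from real data) and compares the equal-weighted return of the full basket with the return of the survivors alone. The name `equal_weight_return` is an illustrative helper, not part of any library.

```python
from statistics import mean

# Hypothetical one-year total returns for five "cheap" stocks.
# All tickers and figures are invented; -1.00 represents a total loss.
cheap_stocks = {
    "AAA": {"ret": 0.25, "delisted": False},
    "BBB": {"ret": 0.10, "delisted": False},
    "CCC": {"ret": -0.05, "delisted": False},
    "DDD": {"ret": -1.00, "delisted": True},  # went bankrupt
    "EEE": {"ret": -0.80, "delisted": True},  # delisted after a collapse
}


def equal_weight_return(stocks):
    """Average return of an equally weighted basket."""
    return mean(s["ret"] for s in stocks.values())


# A survivorship-biased dataset only contains the stocks still listed today.
survivors = {k: v for k, v in cheap_stocks.items() if not v["delisted"]}

biased = equal_weight_return(survivors)       # what the biased backtest sees
unbiased = equal_weight_return(cheap_stocks)  # what a live trader would have earned

print(f"survivors only: {biased:+.1%}")
print(f"full universe:  {unbiased:+.1%}")
```

With these made-up numbers the survivors-only basket shows a gain while the full basket shows a loss, which is the distortion in miniature: the backtest never holds the two positions that were wiped out.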
Sources of survivorship bias
The most obvious source is using a current stock index (like today's S&P 500 constituents) as the historical trading universe. The current S&P 500 only includes companies that are successful enough to be in the index today. Companies that were in the S&P 500 ten years ago but were subsequently removed due to poor performance or acquisition are excluded from the backtest.
Data providers can also introduce survivorship bias if they do not maintain records of delisted securities. Some cheaper data sources only provide data for currently active instruments, making survivorship-free backtesting impossible without supplementing the data.
How to avoid survivorship bias
The primary solution is to use point-in-time data that includes all instruments that existed at each historical date, including those that were later delisted. This means the trading universe at each backtest timestamp matches the universe that was actually available to trade at that time.
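One way to picture a point-in-time universe is as a lookup over listing and delisting dates. The sketch below is a minimal illustration, assuming a hypothetical `LISTINGS` table of (ticker, first trading day, last trading day) records; real data vendors expose this information in their own formats, and the tickers and dates here are invented.

```python
from datetime import date

# Hypothetical listing records: (ticker, first trading day, last trading day).
# A last trading day of None means the instrument is still listed.
LISTINGS = [
    ("AAA", date(2005, 1, 3), None),
    ("BBB", date(2008, 6, 2), date(2015, 3, 13)),
    ("CCC", date(2012, 9, 4), None),
    ("DDD", date(2001, 5, 1), date(2009, 11, 20)),
]


def universe_on(as_of, listings=LISTINGS):
    """Return the tickers that were tradable on the date `as_of`."""
    return sorted(
        ticker
        for ticker, first_day, last_day in listings
        if first_day <= as_of and (last_day is None or as_of <= last_day)
    )


# The universe a backtest should see at the start of 2009.
print(universe_on(date(2009, 1, 2)))

# The universe of today's survivors, which a biased backtest would
# wrongly apply to every historical date.
print(universe_on(date(2020, 1, 2)))
```

The design point is that the backtest asks for the universe at each timestamp rather than fixing one list up front, so instruments enter and leave the tradable set on the dates they actually did.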
Survivorship-free datasets explicitly track delistings and include the final trading prices of delisted securities. When a company goes bankrupt, the dataset records the loss. When a company is acquired, the dataset records the acquisition price. This ensures the backtest experiences the same outcomes a live trader would have faced.
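The treatment of delistings can be sketched as a small return calculation. The function below is a hypothetical helper, not any vendor's API: it assumes the dataset supplies a per-share `terminal_value` when an instrument is delisted (the deal price for a cash acquisition, and zero for a bankruptcy with no recovery, which is a simplifying assumption).

```python
def holding_return(entry_price, prices, terminal_value=None):
    """Return earned from entry until exit.

    prices:         closing prices observed after entry, oldest first
    terminal_value: per-share cash value received at delisting, or None
                    if the instrument is still listed (hypothetical field)
    """
    if terminal_value is None:
        exit_price = prices[-1]      # still listed: mark at the last close
    else:
        exit_price = terminal_value  # delisted: use the recorded outcome
    return exit_price / entry_price - 1.0


# Bankruptcy: assumed zero recovery, so the position is a total loss.
bankrupt = holding_return(10.0, [8.0, 3.0, 0.5], terminal_value=0.0)

# Cash acquisition: the holder receives the (invented) deal price of 30.
acquired = holding_return(20.0, [22.0, 24.0], terminal_value=30.0)

# Still listed: marked at the most recent close.
listed = holding_return(50.0, [55.0, 60.0])

print(f"{bankrupt:+.0%} {acquired:+.0%} {listed:+.0%}")
```

Without the terminal value, the bankrupt position would be marked at its last quoted price or dropped entirely, and the acquisition premium would be missed; both omissions change the measured performance of the strategy.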
Practical example
A trader backtests a value strategy on the Russell 2000 using current constituents over the past 15 years. The backtest shows 14% annual returns. When the same strategy is tested on a survivorship-free dataset that includes all historical constituents, including companies that went bankrupt or were delisted, the returns drop to 9% annually. The gap of 5 percentage points was entirely due to excluding losing stocks that no longer exist.
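Compounding makes the gap in this example larger than the annual figures suggest. The short calculation below applies the two annual returns from the example over the same 15 years; the starting capital of 10,000 is an arbitrary assumption chosen only for illustration.

```python
YEARS = 15
START_CAPITAL = 10_000.0  # arbitrary illustrative starting amount


def growth_multiple(annual_return, years=YEARS):
    """Multiple of starting capital after compounding annually."""
    return (1.0 + annual_return) ** years


biased_multiple = growth_multiple(0.14)    # current constituents only
unbiased_multiple = growth_multiple(0.09)  # survivorship-free dataset

print(f"biased backtest:   {START_CAPITAL * biased_multiple:,.0f}")
print(f"unbiased backtest: {START_CAPITAL * unbiased_multiple:,.0f}")
print(f"overstatement:     {biased_multiple / unbiased_multiple:.2f}x")
```

At these rates the biased backtest grows the starting capital roughly 7.1 times versus roughly 3.6 times for the survivorship-free test, so the final account value is overstated by almost a factor of two.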
How Tektii helps
Tektii encourages survivorship-bias-free backtesting by supporting point-in-time data that includes delisted instruments. The platform's data pipeline preserves historical universe composition so that strategies are tested against the instruments that were actually available at each point in time. This produces backtest results that more accurately reflect what a trader would have experienced in live markets.