How To Identify Overfit Trading Strategies

15 November 2017
The Market Bull

In this post, we describe the main features and behaviours of overfit trading strategies, and the risks they pose to both traders and DARWIN investors.

Overfit trading strategies typically perform well in backtesting environments, creating the illusion that they exploit the targeted market inefficiency very effectively.

However, when deployed in a live trading environment, their performance differs markedly from what was observed in backtesting, primarily because they model historical data too closely.

This prevents such strategies from generalizing well to unseen future data, to the detriment of both the traders who launch them live with capital and the DARWIN investors who back them with capital.

For your convenience, this post is organized as follows:

  1. Types of Overfit Trading Strategies
  2. Typical Features & Behaviours
  3. How Traders Can Address Overfitting
  4. How Investors Can Avoid Overfit or Risky DARWINs

Types of Overfit Trading Strategies

Overfit trading strategies fall into two main categories:

  1. Model-focused – where the strategy fits historical data too closely, and exhibits high variance when tested on unseen data.
  2. Risk-focused – where the performance of a weak model is compensated for with an unrealistic, loss-averse risk management rationale.

In the former (model-focused), a trading strategy will typically perform very well in backtesting, but will either stagnate for lengthy periods or fail entirely in live testing.

Overfit Trading Strategies (model-focused)


From both a trader’s and an investor’s perspective, such strategies are easy to identify visually: at some point, returns reach an inflexion point beyond which they no longer resemble historical performance.


In the latter (risk-focused), a trading strategy will demonstrate smooth, very consistent returns in backtesting, and for a period of time in live testing too. This makes such strategies more dangerous than the former, as they are difficult to identify from returns charts alone.

Overfit Trading Strategies (risk-focused)


Before & After (Risk-Focused)


Later in the post, we’ll discuss certain DARWIN investment attributes that assist both traders and investors in isolating such strategies.


Typical Features & Behaviours

Model-focused

It is usually quite straightforward to identify trading strategies that overfit to historical data.

Compared to training phases in backtests, their test phases and live performance may demonstrate:

  1. Excess stagnation,
  2. Larger drawdowns, and/or
  3. An overall reversal in forecast returns.
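As a rough illustration of the comparison above, the following Python sketch measures total return, maximum drawdown and stagnation for a training series and a test (or live) series of per-period returns. The function names and the sample numbers are hypothetical, and this is not how Darwinex computes any of its investment attributes; it only makes the three symptoms measurable.

```python
from typing import Dict, List


def total_return(returns: List[float]) -> float:
    """Compounded return over the whole series, as a fraction."""
    equity = 1.0
    for r in returns:
        equity *= 1.0 + r
    return equity - 1.0


def max_drawdown(returns: List[float]) -> float:
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def longest_stagnation(returns: List[float]) -> int:
    """Longest run of consecutive periods spent below a previous equity high."""
    equity, peak = 1.0, 1.0
    current, longest = 0, 0
    for r in returns:
        equity *= 1.0 + r
        if equity >= peak:
            peak = equity
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def compare_phases(train: List[float], test: List[float]) -> Dict[str, Dict[str, float]]:
    """Compute the same three statistics for the training and test phases."""
    def summarize(returns: List[float]) -> Dict[str, float]:
        return {
            "total_return": total_return(returns),
            "max_drawdown": max_drawdown(returns),
            "longest_stagnation": longest_stagnation(returns),
        }
    return {"train": summarize(train), "test": summarize(test)}


if __name__ == "__main__":
    # Made-up per-period returns, for illustration only.
    report = compare_phases([0.01, 0.02, 0.01], [0.01, -0.05, -0.02])
    for phase, stats in report.items():
        print(phase, stats)
```

A test phase with a much larger drawdown, a much longer stagnation, or a total return of the opposite sign to the training phase matches the three symptoms listed above.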

Of the 12 DARWIN investment attributes available, this behaviour is captured best by the evolution of the following:

  1. Experience (Ex),
  2. Performance (Pf),
  3. Risk Stability (Rs),
  4. Positive/Negative Returns Consistency (R+/R-),
  5. Duration Consistency (Dc),
  6. Capacity (Cp),
  7. Loss Aversion (La),
  8. Open/Close Strategy (Os/Cs).

Low scores in these attributes, or unstable evolution of those scores (especially in live trading), can serve as a useful indicator that a strategy was overfit in backtesting.

When a strategy is overfit to training data, the evolution of its scores for the above attributes is likely to demonstrate high variance when subjected to test data and/or in live trading.


Typically, three combinations of scores for the above attributes are associated with consistent performance between backtests and live trading (when accompanied by high Ex, Mc and Rs scores):

  1. (Low Cp | High Os/Cs | High Pf | High R-) – Once a strategy with good scores for these in the backtest is launched live, if scores for Os/Cs and R- progressively decline over time, the likelihood of the strategy being overfit increases.
  2. (Moderate Cp | High La | High Pf) – If high scores for La and Rs progressively decline upon launching live, the likelihood of the strategy being overfit increases.
  3. (High Pf | Very High R+/R- or Dc | Moderate La) – A strategy with this combination of scores in the backtest is the least likely to be overfit, and quite difficult to find. However, the same rules for monitoring declines in scores apply here too once it is launched live.

Traders can therefore benefit from uploading trading strategy backtests to the Darwinex platform for analysis.

Examining the evolution of the scores received provides a valuable layer of insight into how closely live trading performance is likely to match backtested performance.

For example, if steady evolution is observed over training data, but high variance in test data, the likelihood of the strategy generalizing well to unseen data in live trading is low.
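To make this kind of check concrete, here is a minimal Python sketch that compares how a single attribute's score evolves over the training phase and over the test (or live) phase. It assumes the scores have already been collected by hand as plain lists of numbers. The function names, the thresholds (`std_ratio_limit`, `slope_limit`, `min_std`) and the sample scores are all hypothetical, chosen for illustration rather than taken from Darwinex.

```python
from statistics import pstdev
from typing import Dict, Sequence


def trend_slope(scores: Sequence[float]) -> float:
    """Least-squares slope of the scores against their position (0, 1, 2, ...)."""
    n = len(scores)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    cov = sum((i - mean_x) * (s - mean_y) for i, s in enumerate(scores))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


def score_stability(train: Sequence[float], test: Sequence[float]) -> Dict[str, float]:
    """Dispersion of the score in each phase, plus its trend in the test phase."""
    return {
        "train_std": pstdev(train),
        "test_std": pstdev(test),
        "test_slope": trend_slope(test),
    }


def looks_unstable(
    train: Sequence[float],
    test: Sequence[float],
    std_ratio_limit: float = 2.0,  # assumed threshold, not a Darwinex value
    slope_limit: float = -0.1,     # assumed threshold, not a Darwinex value
    min_std: float = 0.1,          # floor so a perfectly flat training phase
                                   # does not flag every tiny wobble
) -> bool:
    """Flag a score that is much noisier, or steadily declining, in the test phase."""
    stats = score_stability(train, test)
    noisier = stats["test_std"] > std_ratio_limit * max(stats["train_std"], min_std)
    declining = stats["test_slope"] < slope_limit
    return noisier or declining


if __name__ == "__main__":
    # Made-up score histories for one attribute, for illustration only.
    train_scores = [7.0, 7.1, 7.0, 6.9, 7.0]
    live_scores = [7.0, 6.2, 5.5, 4.1, 3.0]
    print(score_stability(train_scores, live_scores))
    print(looks_unstable(train_scores, live_scores))
```

Running the same check on each attribute separately keeps the result easy to interpret. A single flagged attribute is a prompt for closer inspection, not proof of overfitting.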

Stable Evolution of Risk & Duration Consistency



Risk-focused

In trading strategies where loss-averse risk management compensates for poor timing (and generates unrealistic returns in backtesting), one or more of the following behaviours can be observed:

  1. Poorly timed trades are not closed for lengthy periods of time,
  2. Additional orders are opened in the same direction as poorly timed ones in an attempt to recover the position at incrementally better prices,
  3. Excess leverage is employed per trade in an attempt to recover losing positions at incrementally better prices.
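The second and third behaviours above can be looked for directly in a trade history. The Python sketch below assumes a hypothetical `Trade` record (its fields are illustrative, not a Darwinex or MetaTrader export format). It finds orders opened in the same direction as an earlier, still-open trade after price has moved against that trade, then counts how many of those additions used a larger size. It does not cover the first behaviour (holding losing trades for a long time), which would also require closing prices.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trade:
    """Hypothetical trade record; field names are illustrative only."""
    symbol: str
    direction: int      # +1 for long, -1 for short
    open_time: float    # any consistent time unit, e.g. hours
    close_time: float
    open_price: float
    size: float         # e.g. lots


def adds_to_losers(trades: List[Trade]) -> List[Tuple[Trade, Trade]]:
    """Pairs (first, later) where `later` adds to `first` while `first` is losing."""
    ordered = sorted(trades, key=lambda t: t.open_time)
    pairs = []
    for i, first in enumerate(ordered):
        for later in ordered[i + 1:]:
            same_side = (later.symbol == first.symbol
                         and later.direction == first.direction)
            still_open = first.open_time < later.open_time < first.close_time
            # Price below a long's entry (or above a short's) means `first` is losing.
            moved_against = first.direction * (later.open_price - first.open_price) < 0
            if same_side and still_open and moved_against:
                pairs.append((first, later))
    return pairs


def size_escalations(pairs: List[Tuple[Trade, Trade]]) -> int:
    """Number of additions that were larger than the trade they tried to rescue."""
    return sum(1 for first, later in pairs if later.size > first.size)


if __name__ == "__main__":
    # Made-up trades, for illustration only.
    history = [
        Trade("EURUSD", 1, 0.0, 10.0, 1.2000, 1.0),
        Trade("EURUSD", 1, 2.0, 10.0, 1.1950, 2.0),   # adds to a losing long, double size
        Trade("EURUSD", -1, 3.0, 4.0, 1.1940, 1.0),   # opposite direction, ignored
    ]
    found = adds_to_losers(history)
    print(len(found), size_escalations(found))
```

A high count relative to the total number of trades suggests that results depend on recovering losing positions rather than on timing.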

Of the 12 DARWIN investment attributes available, this behaviour is captured best by the following:

  1. Loss Aversion (La)
  2. Combination of low La and high Capacity (Cp)
  3. Risk Stability (Rs)
  4. Market Correlation (Mc)

The evolution of the scores received for these investment attributes provides valuable insight into whether a strategy compensates for inferior timing by employing loss-averse risk management practices.

Poor scores for Rs and Mc in particular add strong confirmation that a strategy is likely to be overfit in the risk-focused sense.

La vs Cp vs Rs (Evolution of DARWIN Investment Attributes)


A strongly negative correlation with DARWIN $DWC adds further confirmation of this risk.
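For readers who want to check such a correlation themselves, the sketch below computes the Pearson correlation between two aligned return series, for example a strategy's periodic returns and those of DARWIN $DWC. It assumes both series have already been obtained and aligned period by period. The `-0.7` cut-off for "strongly negative" is an arbitrary illustrative choice, not a figure from Darwinex.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def is_strongly_negative(correlation: float, threshold: float = -0.7) -> bool:
    """True when the correlation is at or below the (assumed) threshold."""
    return correlation <= threshold


if __name__ == "__main__":
    # Made-up return series, for illustration only.
    strategy_returns = [0.010, 0.012, -0.030, 0.011, -0.025]
    benchmark_returns = [-0.008, -0.010, 0.020, -0.009, 0.018]
    rho = pearson(strategy_returns, benchmark_returns)
    print(round(rho, 3), is_strongly_negative(rho))
```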


 

How Traders Can Address Overfitting

A recent blog post – DO’s and DONT’s of MetaTrader 4 Backtesting – details several steps traders can take to both address and eliminate overfitting from their trading strategies.


How Investors Can Avoid Overfit or Risky DARWINs

As described under “Typical Features & Behaviours” above, monitoring the evolution of a strategy’s scores for:

  1. Experience (Ex),
  2. Risk Stability (Rs),
  3. Market Correlation (Mc),
  4. Performance (Pf),
  5. Loss Aversion (La),
  6. Capacity (Cp),
  7. Open/Close Strategy (Os/Cs),
  8. Positive/Negative Returns Consistency (R+/R-), and
  9. Duration Consistency (Dc)

...can help DARWIN investors exercise caution with, or avoid entirely, both types of overfit trading strategies discussed in this post.

More detailed information on each investment attribute is available via the Education Section.


[Additional Resources] (Video) How To Identify Overfit Trading Strategies
