Workflow Tips When Using Data Mining Software
With data mining software such as StrategyQuant, Adaptrade and EA Studio now readily available to retail traders, one of the key issues you need to address is ‘curve fitting’. Curve fitting is not restricted to data mining; it is also a feature of mean reverting strategies that possess negative skew over the long term.
Any strategy can be broadly classified into one of two types: convergent strategies and divergent strategies. They represent the extremes of a spectrum, with all strategies plotting somewhere between.
Convergent strategies are ‘backwards looking’ with a central assumption that price will converge to an estimated historic equilibrium. In a sense these strategies are predictive in nature. They include mean reversion strategies, many value investing approaches, grid trading styles and Martingale methods, all of which apply a principle of convergence to an estimated value over time. Because price repeatedly converges towards the current market equilibrium while that equilibrium holds, these strategies are typically built with quite complex rules based on the ‘expected certainty of this outcome’. The complexity is a symptom of ‘fitting’ the strategy to the nature of the current market condition.
These are the types of strategy that most retail traders prefer, given their linear equity curves and high win rates over the periods in which equilibrium persists. However, when market conditions shift to a new equilibrium, the past stable equilibrium is no longer respected. You quickly find that your strategy no longer performs as expected, and you are left with a sequence of large losses that frequently ends in an account blow-up. Because the ‘true’ equity curve is only revealed over the long term, it is advised to use long term data horizons that span a broad array of market conditions to expose this behaviour.
If you use short term data horizons to develop your strategy, the chances are great that you are biasing your outcomes towards ‘convergent styles’ that have negative skew in the long term. This is not advised. You will find that you need to continuously mine new strategies when existing strategies fail, and you will become a victim of ‘strategy hopping’. Be aware that the need to continuously replace your strategies is a form of ‘market timing’, and there are swags of research papers available that demonstrate that using market timing methods to ‘switch on or switch off’ your strategies is a fool’s errand, as the decisions are invariably lagging in nature.
The other class of strategy is referred to as ‘divergent strategies’. They lie at the other end of the spectrum and are the opposite of convergent styles: they are non-predictive in nature and simply operate off the principle that stable market conditions will not persist and that conditions will move towards a new equilibrium in the future. The location of this new equilibrium is unknown given market uncertainty, and the only assumption you use is that it will be either higher or lower than the current equilibrium. As a result, these strategies are said to ‘follow’ price as opposed to ‘predict future price’. They typically flounder when stable market conditions persist but perform very well when markets diverge towards new equilibria.
To avoid death by a thousand cuts during normal market conditions, divergent traders apply trade filters so that their strategy is only active during more ‘exotic’ times when price moves outside the ‘normal’ range. The filter significantly reduces your trade frequency and the frictional costs of trading, and ensures that you only participate when it is more probable that conditions represent market divergence as opposed to market stability.
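As a rough illustration of such a filter, here is a minimal Python sketch. It assumes that ‘normal’ means the high/low range of the closes over the previous `lookback` bars, which is only one of many possible definitions. The function name and the default of 50 bars are hypothetical and not taken from any of the packages mentioned above.

```python
def outside_normal_range(closes, lookback=50):
    """Return +1 if the latest close is above the prior range,
    -1 if it is below the prior range, and 0 if it is inside it.

    'Prior range' is assumed to be the min/max of the `lookback`
    closes immediately before the latest one.
    """
    if len(closes) <= lookback:
        return 0  # not enough history to define a normal range
    window = closes[-lookback - 1:-1]  # excludes the latest close
    last = closes[-1]
    if last > max(window):
        return 1
    if last < min(window):
        return -1
    return 0


history = [10, 11, 12, 11, 10, 11, 12, 11]
signal = outside_normal_range(history + [13], lookback=5)  # close breaks above the range
```

A strategy would only be allowed to open trades on bars where the result is non-zero, which is what cuts the trade frequency during day to day conditions.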
These divergent strategies are developed through logical design choices that build an asymmetry into the strategy by cutting losses short and letting profits run. They all use stop losses, and tend to use different forms of trailing stop, to manage adverse risk exposure and cut losses short. They keep their profit potential open, however, and tend not to use profit targets. The reason for this open-ended exit condition is to prey on ‘fat tailed’ price moves and catch the occasional white swan that, according to Ed Seykota in his whipsaw ditty, ‘pays for them all’.
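The asymmetry can be sketched in a few lines of Python. This is a simplified illustration only: it assumes a single long trade, closing prices only, a fixed trailing distance and an exit at the close that breaches the stop. The function name and the sample prices are made up for the example.

```python
def trailing_stop_trade(closes, trail):
    """Profit or loss of one long trade entered at closes[0].

    The stop starts `trail` below the entry and only ever moves up,
    following the highest close seen so far. There is no profit target.
    """
    entry = closes[0]
    highest = entry
    stop = entry - trail
    for price in closes[1:]:
        if price <= stop:
            return price - entry          # stopped out at this close
        highest = max(highest, price)
        stop = max(stop, highest - trail)  # ratchet the stop upwards
    return closes[-1] - entry              # still open at the end of the data


winner = trailing_stop_trade([100, 101, 103, 106, 110, 108, 107], trail=2.5)
loser = trailing_stop_trade([100, 99, 97, 95], trail=2.5)
```

In this toy data the loser is cut at roughly the initial risk, while the winner is left to run until price gives back the trailing distance. That is the asymmetry the design is trying to enforce.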
Given the ‘cutting losses short’ condition and the filters applied to these strategies, this form of strategy is far more long lasting than its convergent cousins. Its drawdown signature takes time to develop and reflects extended periods when market conditions are not favorable to the strategy. Furthermore, this form of strategy only requires very simple asymmetrical rules that allow for ‘degrees of market freedom’ in price movement. With simple rules, the strategy floats on the market condition. With more rules, your strategy starts dictating terms to the market. It is the market that decides your returns, not your complex system design.
Divergent strategies typically have volatile equity curves over their long lifetimes, where drawdowns are the result of many small sequential losses. But, and here is the rub, so do convergent strategies. The problem with convergent strategies is that their linear equity curves are a symptom of their limited lifespan. If measured over the same time spans as divergent strategies, you will see that the beautiful linear equity curve is a temporary condition only. In fact, over these long periods the vast majority of convergent strategy equity curves end in failure, with drawdowns of 100% or more. Unfortunately, we only see the nice linear equity curves when convergent strategies are performing.
So in essence if you are interested in mining over long term data, then consider using the following principles in your workflow:
1. Reserve at least 30% of your long range data set as Out of Sample (OOS) data. This ensures that optimization is only performed on the 70% In Sample (IS) component. The OOS component is simply used to check that the IS performance metrics hold up on data the optimizer has not seen. Discard all strategies where the OOS performance falls well below the IS performance metrics. The closer the OOS performance is to the IS benchmark, the better.
2. Ensure stops and trailing stops are allowed for in the design settings and keep profits unlimited in nature.
3. Use a preset filter to avoid trading during the normal ‘day to day’ conditions.
4. Use the multi-market feature to test your strategy on other unseen data. The greater the number of alternative markets the strategy works on the better.
5. Use stringent settings on your Monte Carlo tests, but do not set high acceptance levels. Expect variance in your results, but ensure that the array of MC runs is generally profitable overall. This is used to confirm that your strategy only requires a ‘weak edge’ to be profitable.
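Principles 1 and 5 can be sketched in Python as follows. This is a simplified illustration, not how any particular package implements these tests. It assumes a strategy is represented by a chronological list of per-trade results, that ‘performance’ means average profit per trade, and that the Monte Carlo test is a plain bootstrap (resampling trades with replacement). The function names, the `min_ratio` threshold of 0.5 and the sample trades are all hypothetical.

```python
import random


def split_is_oos(trades, oos_fraction=0.3):
    """Split a chronological trade list into In Sample and Out of Sample parts."""
    n_oos = round(len(trades) * oos_fraction)
    cut = len(trades) - n_oos
    return trades[:cut], trades[cut:]


def expectancy(trades):
    """Average profit per trade."""
    return sum(trades) / len(trades)


def oos_holds_up(trades, oos_fraction=0.3, min_ratio=0.5):
    """True if OOS expectancy is at least `min_ratio` of a positive IS expectancy."""
    is_trades, oos_trades = split_is_oos(trades, oos_fraction)
    is_exp = expectancy(is_trades)
    if is_exp <= 0:
        return False
    return expectancy(oos_trades) >= min_ratio * is_exp


def monte_carlo_profitable_share(trades, runs=1000, seed=0):
    """Share of bootstrap resamples of the trade list that end in profit."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    profitable = 0
    for _ in range(runs):
        sample = rng.choices(trades, k=len(trades))  # resample with replacement
        if sum(sample) > 0:
            profitable += 1
    return profitable / runs


trades = [2, -1, 3, -1, -1, 4, -1, 2, -1, 3]
keep = oos_holds_up(trades)
share = monte_carlo_profitable_share(trades)
```

In keeping with principle 5, you would look for `share` to show that most runs are profitable without demanding that nearly every run is, and you would expect individual runs to vary widely.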
Remember that over the long term, say 20 years plus, your equity curve is quite deceptive. Despite the ‘apparent’ fairly linear nature of the curve when viewed from a height, when you drill down there is considerable volatility and there are long periods of stagnation. Those looking for ‘instant’ results are unlikely to have the patience to tolerate them.
So the way we overcome the volatility and uncertainty associated with the future of a particular strategy is to diversify into a broad array of different strategies, each with a slight positive expectancy, that are uncorrelated in nature. The ‘uncorrelated’ requirement is used to ensure that at ‘most times’ your portfolio has positive momentum and that the ‘stagnation period’ of your entire portfolio is minimized. The uncorrelated nature of the strategies also means that the drawdowns of each component strategy are dampened when the strategies are compiled into a portfolio.
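The effect of correlation can be shown with the standard formula for the volatility of an equally weighted portfolio. The sketch below uses volatility as a rough proxy for drawdown, and assumes every strategy has the same volatility and the same pairwise correlation. Both are simplifying assumptions, and the numbers are illustrative only.

```python
import math


def portfolio_volatility(sigma, n, rho):
    """Volatility of an equal-weight portfolio of n strategies, each with
    volatility `sigma` and pairwise correlation `rho`."""
    return sigma * math.sqrt((1 + (n - 1) * rho) / n)


uncorrelated = portfolio_volatility(0.20, n=4, rho=0.0)  # roughly 0.10
correlated = portfolio_volatility(0.20, n=4, rho=1.0)    # stays at 0.20
```

Four uncorrelated strategies halve the portfolio volatility, whereas four perfectly correlated strategies give no reduction at all. That is why the ‘uncorrelated’ requirement matters more than the raw number of strategies.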
So the decision to turn off a strategy is a long term one in the divergent space but a short term one in the convergent space. With convergent strategies it is very difficult to distinguish between an ordinary drawdown and the onset of total ruin. Drawdowns in divergent strategies, by contrast, take a long time to play out, and under diversification, where each strategy comprises a small portion of your total equity, you are very unlikely to meet ‘total risk of ruin’.
Under a diversified portfolio of divergent strategies you continuously monitor live performance against your long term backtest. You set performance benchmarks for your live trading that use the backtest performance metrics as a guide. You only drop a strategy when those benchmarks (based on long term data) are exceeded. In this game you need to rely on the Law of Large Numbers as your guide.
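One way to express such a benchmark is sketched below in Python. It assumes the benchmark is the maximum drawdown of the long term backtest, with a tolerance multiplier on top. The multiplier of 1.5, the function names and the sample equity values are hypothetical, and other metrics (stagnation length, for example) could be monitored in the same way.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def should_retire(live_equity, backtest_max_dd, tolerance=1.5):
    """True once the live drawdown exceeds the backtest drawdown by the tolerance."""
    return max_drawdown(live_equity) > tolerance * backtest_max_dd


# Backtest showed a 20% maximum drawdown, so the retirement line sits at 30%.
still_ok = should_retire([100, 120, 90, 130], backtest_max_dd=0.20)  # 25% live drawdown
retire = should_retire([100, 120, 60], backtest_max_dd=0.20)         # 50% live drawdown
```

The point of fixing the threshold in advance from long term data is that the decision to drop a strategy becomes a rule, not a reaction to the latest losing streak.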
Trade well and prosper
Rich B