Evidence-Based Technical Analysis: Applying the Scientific Method and Statistical Inference to Trading Signals
Today I’ll be talking about an excellent book that was recommended on several “quant” blogs I read: Evidence-Based Technical Analysis by David Aronson. One of the main reasons I picked this book is that it teaches you to fish (instead of giving you a fish). So if you’re after a book with great trading strategies or indicators, this might not be the ideal one; however, if you want to learn about strategy testing and methodology, it’s probably a great addition to your trading library. It had been on my list for a while and I wish I’d read it earlier, as it has the potential to add cornerstone methods to trading research and testing procedures. Read on for a summary, with a review right at the end…
One of the early quotes from the book defines the concept it covers:
The scientific method is the only rational way to extract useful knowledge from market data and the only rational approach for determining which TA methods have predictive power. I call this evidence-based technical analysis (EBTA).
Aronson introduces early on the concept of objective vs. subjective TA. An objective claim is a meaningful proposition that can be unambiguously verified. For us mechanical trading system developers, that means a set of rules that can be back-tested. Subjective technical analysis, on the other hand, consists of approaches such as Elliott Wave analysis.
However, objective technical analysis is not sufficient on its own: you still need rigorous statistical inference to draw conclusions about its predictive power.
Part One: the Foundations
Part one of the book establishes the methodological, philosophical, psychological and statistical foundations of EBTA.
The first topic covered is the need for benchmarking when evaluating objective rules; it introduces the concept of detrending, which I have previously discussed.
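To make the detrending idea concrete, here is a minimal sketch of how I understand it: the average daily change is subtracted from the market returns before a rule is evaluated, so that a rule with a long bias gets no credit for simply riding an upward drift. The function names and the toy return series are my own assumptions for illustration, not code from the book.

```python
from statistics import fmean

def detrend(returns):
    """Subtract the average return so the series has a zero mean."""
    avg = fmean(returns)
    return [r - avg for r in returns]

def rule_return(positions, returns):
    """Mean daily return of a rule: position (+1 long, -1 short) times market return."""
    return fmean(p * r for p, r in zip(positions, returns))

# Made-up market with an upward drift, and a "rule" that is simply always long.
market = [0.01, 0.02, -0.005, 0.015, 0.01]
always_long = [1] * len(market)

raw = rule_return(always_long, market)
adjusted = rule_return(always_long, detrend(market))
print(f"raw: {raw:.4f}, detrended: {adjusted:.4f}")
```

On the raw data the always-long rule looks profitable; on the detrended data its return is roughly zero, which is the benchmark you would want for a rule with no timing ability.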
The second topic deals with cognitive psychology and gives examples of different types of behavioral biases that can fool us and make us believe in subjective technical analysis:
- Pattern recognition
- Confirmation bias
- Hindsight bias
- Illusory correlations
- Mis-perception of randomness
The antidote for these “mind traps” is the scientific method. The generic scientific method is covered in the third chapter, along with some history and philosophy of science and logical reasoning. The scientific method, which can and should be applied to technical analysis, is described as consisting of five stages.
Subjective TA does not conform to the scientific method, and the author presents an interesting study in which a subjective TA pattern (Head and Shoulders) is objectified to make it testable; it shows that Head and Shoulders is worthless on stocks and has doubtful value on currencies.
Statistical Analysis of Back-Test Results
The next three chapters introduce and cover statistical analysis. The beginning of this part gives a good refresher on statistical inference, starting with concepts such as frequency distribution, standard deviation, probabilities and p-values. The example of sampling and statistical inference using beads in a box makes for a good illustration and a fairly clear parallel with the world of trading rules back-testing.
The book moves on to concepts such as hypothesis testing, statistical significance and confidence intervals, and how they relate to rule testing.
One of the main issues with back-testing results is that they represent only one sample of how the system or rule performs. Aronson presents the classical statistical approach to deriving the sampling distribution (required to perform statistical inference) from a single sample. However, this assumes normality of the distribution, which is unlikely to hold when dealing with financial data.
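As a rough illustration of the classical approach, the sketch below estimates the standard error of the mean return from a single sample and converts the resulting z-score into a one-sided p-value using the normal distribution. The function name and the example series are assumptions of mine, and the normality assumption is exactly the weak point mentioned above.

```python
import math
from statistics import fmean, stdev

def classical_p_value(returns):
    """One-sided test of 'mean return > 0' under a normal sampling distribution.

    Returns (z, p): the z-score of the sample mean and P(Z >= z).
    """
    n = len(returns)
    mean = fmean(returns)
    std_err = stdev(returns) / math.sqrt(n)   # standard error of the mean
    z = mean / std_err
    p = 0.5 * math.erfc(z / math.sqrt(2))     # upper tail of the standard normal
    return z, p

# Made-up daily returns for a hypothetical rule.
example = [0.004, -0.002, 0.001, 0.003, -0.001, 0.002, -0.003, 0.005]
z, p = classical_p_value(example)
print(f"z = {z:.2f}, one-sided p-value = {p:.3f}")
```

With a sample this small a t-distribution would be the more usual choice; the normal tail is used here only to keep the sketch dependency-free.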
New Scientific Methods for Back-Testing
This last point leads to the introduction of two alternative methods to derive the sampling distribution and perform statistical inference on the back-tested results. Both are computer-based:
- The Bootstrap
- The Monte Carlo permutation
Both methods estimate the sampling distribution by randomly resampling (reusing) the original sample of observations. A test statistic is then computed for each resample.
In practice, the bootstrap method uses resampling with replacement of the daily strategy returns to generate numerous random test statistics used to approximate a sampling distribution.
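Here is a minimal sketch of the bootstrap idea, under a few assumptions of mine: the test statistic is the mean daily return, the returns are zero-centered first so that the resamples describe a rule with no edge, and the p-value is the fraction of resampled means at least as large as the observed one. The function name, parameters and toy data are hypothetical.

```python
import random
from statistics import fmean

def bootstrap_p_value(strategy_returns, n_resamples=2000, seed=0):
    """Estimate how often a zero-edge rule would match the observed mean return."""
    rng = random.Random(seed)
    observed = fmean(strategy_returns)
    # Zero-center so the resampled distribution reflects "no predictive power".
    centered = [r - observed for r in strategy_returns]
    n = len(centered)
    hits = 0
    for _ in range(n_resamples):
        resample = rng.choices(centered, k=n)   # sampling with replacement
        if fmean(resample) >= observed:
            hits += 1
    return hits / n_resamples

# Toy daily strategy returns, generated at random for the example.
data_rng = random.Random(1)
toy_returns = [data_rng.gauss(0.0005, 0.01) for _ in range(500)]
print(f"bootstrap p-value: {bootstrap_p_value(toy_returns):.3f}")
```

A small p-value suggests the back-tested mean return would rarely be produced by chance alone; a large one means the result is hard to tell apart from noise.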
The Monte Carlo permutation method achieves the same result by decoupling the position direction (i.e. long or short) from the daily instrument returns and randomly re-pairing them.
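The permutation idea can be sketched in the same spirit. In this minimal version (function name, parameters and data are my own assumptions), the rule's positions are shuffled against the instrument returns many times, and the p-value is the fraction of shuffles whose mean return is at least as good as the real pairing.

```python
import random
from statistics import fmean

def permutation_p_value(positions, market_returns, n_permutations=2000, seed=0):
    """Compare the real position/return pairing against random pairings."""
    rng = random.Random(seed)
    observed = fmean(p * r for p, r in zip(positions, market_returns))
    shuffled = list(positions)   # copy, so the caller's list is left untouched
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)    # break any link between position and return
        permuted = fmean(p * r for p, r in zip(shuffled, market_returns))
        if permuted >= observed:
            hits += 1
    return hits / n_permutations

# Toy data: random instrument returns and a rule that flips a coin each day.
data_rng = random.Random(2)
toy_market = [data_rng.gauss(0.0, 0.01) for _ in range(500)]
toy_positions = [data_rng.choice([1, -1]) for _ in range(500)]
print(f"permutation p-value: {permutation_p_value(toy_positions, toy_market):.3f}")
```

Because each shuffle keeps the same number of long and short days, the comparison isolates the timing of the positions rather than the rule's overall long or short bias.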
Using the statistical inference covered in earlier chapters, one can decide whether results found in the back-test are statistically significant or the product of random chance.
These two methods are the main take-away from the book, as they are valuable to identify the degree of randomness in a back-tested rule. This should probably be part of a standard trading system research methodology and I will cover these two methods in more detail in later posts.
On Data Mining
The methods above only deal with one rule/back-test. However, we rarely test a single rule in isolation: most back-testing involves multiple parameter values, rules and combinations to try and identify the best-performing ones. This is data mining.
It is, however, wrong to expect the future performance of the best-performing systems to stay in line with past, back-tested results. The best-performing systems might have intrinsic value, but some of their over-performance is due to random variation. If you run 1,000 different rules with no predictive power, each of them will show some departure from the zero mean purely by chance. The “most lucky” rule will be furthest away on the right-hand side of the zero mean (and therefore picked up by the data miner), despite having no intrinsic value.
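The “most lucky” rule effect is easy to reproduce. The sketch below simulates 1,000 rules whose daily returns are pure noise with a true mean of zero and then looks at the best back-tested mean; the number of days, the volatility and the function name are arbitrary assumptions of mine.

```python
import random
from statistics import fmean

def simulate_useless_rules(n_rules=1000, n_days=250, daily_vol=0.01, seed=42):
    """Back-tested mean daily return of rules whose true expected return is zero."""
    rng = random.Random(seed)
    return [
        fmean(rng.gauss(0.0, daily_vol) for _ in range(n_days))
        for _ in range(n_rules)
    ]

means = simulate_useless_rules()
best = max(means)
print(f"average across rules: {fmean(means):.5f}")
print(f"best rule:            {best:.5f}")
print(f"rules with a positive back-test: {sum(m > 0 for m in means)}")
```

The average across all rules stays close to zero, but the best rule shows a clearly positive mean return, even though by construction none of them has any predictive power.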
Data mining introduces a bias, which overstates the value of the “best” rule compared to expected random variations. The data mining bias is linked to several factors:
- Increases with the number of rules back-tested
- Decreases with sample size used in back-testing
- Decreases with the correlation of back-tested rules results
- Increases with the frequency of outliers in the back-test sample
- Decreases with the variation in back-tested returns among rules considered
This is illustrated with examples and charts. The rest of the chapter concentrates on methods to reduce or correct for the data mining bias, and adapts the bootstrap method (using White’s reality check) and the Monte Carlo permutation method for use in “data mining” mode (instead of single rule testing).
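To give a flavour of what “data mining” mode means, here is a simplified sketch in the spirit of a bootstrap reality check. It is not White’s procedure as published (which, as far as I understand, relies on a stationary block bootstrap) nor necessarily the book’s exact implementation: it uses plain day-by-day resampling, and every name and parameter is my own assumption. The key change from the single-rule case is that the sampling distribution is built from the best mean return across all rules in each resample.

```python
import random
from statistics import fmean

def best_rule_p_value(rule_returns, n_resamples=500, seed=0):
    """p-value of the best rule's mean return, accounting for how many rules were tried.

    rule_returns: one list of daily returns per rule, all of the same length.
    """
    rng = random.Random(seed)
    n_days = len(rule_returns[0])
    rule_means = [fmean(r) for r in rule_returns]
    observed_best = max(rule_means)
    # Zero-center every rule so the resamples describe rules with no edge.
    centered = [[x - m for x in r] for r, m in zip(rule_returns, rule_means)]
    hits = 0
    for _ in range(n_resamples):
        # Use the same resampled days for every rule to keep their correlation.
        days = [rng.randrange(n_days) for _ in range(n_days)]
        best = max(fmean(r[d] for d in days) for r in centered)
        if best >= observed_best:
            hits += 1
    return hits / n_resamples

# Toy example: 20 rules made of pure noise, 250 days each.
data_rng = random.Random(3)
noise_rules = [[data_rng.gauss(0.0, 0.01) for _ in range(250)] for _ in range(20)]
print(f"p-value of the best of 20 useless rules: {best_rule_p_value(noise_rules):.3f}")
```

The best rule is compared against the best of many no-edge rules rather than against a single one, which is what corrects for the luck introduced by the search itself.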
In conclusion, data mining is a valid method to discover the best rule(s) but the researcher should ensure that the results are statistically significant to avoid the risk of discovering “most lucky” rules.
A Tour of the EMH and Application of Methods
The following chapter deals with the Efficient Market Hypothesis, which takes a bit of a beating from the author. The main point is that, from both an empirical and a theoretical point of view, the EMH contains flaws, which supports the idea that TA can be successful.
The last part of the book presents a diverse set of rules and parameters (6,402 combinations) and attempts to test for their statistical significance. The rules are fairly simple and the results do not highlight significant predictive power in any rule.
This book is a very interesting read, though on the long side at 450+ pages. Even though I enjoyed it throughout, I sometimes found myself wishing the author would not expand so much on some introductory topics (the history and philosophy of science is quite interesting but could well be skimmed to get to the “juicier” parts quicker). If you’re in a rush, I’d advise concentrating on chapters 4, 5 and 6, where the actual bootstrap and Monte Carlo methods are presented and discussed; the discussion of data mining bias is also interesting and very relevant. For a reader new to these concepts, the initial chapters provide a comprehensive introduction to the foundations of scientific reasoning and statistical analysis before putting them all together in application.
For more info, some of the reviews on Amazon are quite insightful (mostly positive, although the book has its share of 1-star reviews). There is also a companion website to the book with more info and detailed results of the tests performed in the last part of the book.