Systematic Trading research and development, with a flavour of Trend Following
Au.Tra.Sy blog – Automated trading System

A Different Application of the Bootstrap

September 28th, 2010 · 3 Comments · Backtest

In the last volatility filters post we saw that trades from a simple Trend Following system (20-50 MA cross-over) had a different expectancy depending on the relative level of volatility at trade entry. This suggested that a filter blocking the trades with the highest volatility at entry (the top decile: 90 to 100% of past volatility) would raise the expectancy per trade.

However, this conclusion was obtained from a single observation (i.e. one back-test sample). Ideally we want many samples, to be able to establish more robust conclusions – or at least be able to calculate a level of confidence in the result from our one observation.

And this is where the bootstrap test comes in handy. In this post I’ll try and illustrate how we can apply the bootstrap concept to a slightly different problem, in order to strengthen back-testing research results.

Bootstrap to compare two populations

Instead of using the bootstrap test to check if the profitability of a single back-test is statistically significant, we use it to check whether the difference between 2 sample groups is statistically significant. Here, the 2 sample groups are:

  1. Trades with entry volatility at 0-90% of historical levels (lower-volatility group)
  2. Trades with entry volatility at 90-100% of historical levels (high-volatility group)
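As a rough sketch of how trades could be sorted into these two groups: the helper names, the sample data and the way volatility is ranked against its own history are all assumptions made for illustration, and the post does not say which group a trade sitting exactly on the 90% boundary belongs to (here it goes to the high-volatility group).

```python
from bisect import bisect_right

def volatility_percentile(past_vols, entry_vol):
    """Percentage (0-100) of past volatility readings at or below entry_vol."""
    ordered = sorted(past_vols)
    return 100.0 * bisect_right(ordered, entry_vol) / len(ordered)

def split_by_volatility(trades, cutoff=90.0):
    """Split (percentile, r_multiple) pairs into lower- and high-volatility groups.

    Assumption: a trade exactly at the cutoff is placed in the high group.
    """
    lower = [r for pct, r in trades if pct < cutoff]
    high = [r for pct, r in trades if pct >= cutoff]
    return lower, high

# Hypothetical illustration data, not results from the back-test.
past_vols = [0.8, 1.0, 1.1, 1.3, 1.5, 1.7, 2.0, 2.4, 2.9, 3.5]
trades = [
    (volatility_percentile(past_vols, 1.2), 1.4),   # 30th percentile
    (volatility_percentile(past_vols, 3.6), -1.0),  # 100th percentile
    (volatility_percentile(past_vols, 0.9), -0.5),  # 10th percentile
    (volatility_percentile(past_vols, 3.0), 0.3),   # 90th percentile
]
lower, high = split_by_volatility(trades)
```

Each group ends up as a plain list of R-multiples, which is all the test needs.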

The difference between the 2 groups can be measured by the difference of the means of each group. In the original samples, the difference between the R-multiple mean values was fairly high, with a ratio of nearly 3-to-1:

  1. Lower-volatility group: Average R-multiple = 0.27
  2. High-volatility group: Average R-multiple = 0.10
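Spelled out with the two averages above (the symbols are shorthand introduced here, with group 1 the lower-volatility group), the observed difference and ratio are:

```latex
\bar{R}_1 - \bar{R}_2 = 0.27 - 0.10 = 0.17
\qquad
\frac{\bar{R}_1}{\bar{R}_2} = \frac{0.27}{0.10} = 2.7
```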

However, this difference could be down to random variation. Running a statistical test such as a bootstrap will enable us to determine a level of confidence in the results.

Similarly to a two-sample t-test, we need to formulate the null hypothesis (H0):

H0: the mean of each group is identical (mean1 = mean2). This is equivalent to the difference of the means being nil (mean1 – mean2 = 0)

with the alternative hypothesis (Ha):

Ha: the mean of group 1 (lower volatility group) is higher than the mean of group 2
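In symbols, writing the true mean R-multiple of each group as mu_1 and mu_2 (notation assumed here, not used in the original post), the one-sided test and the statistic being bootstrapped can be stated as:

```latex
\begin{aligned}
H_0 &: \mu_1 - \mu_2 = 0 \\
H_a &: \mu_1 - \mu_2 > 0 \\
\hat{d} &= \bar{R}_1 - \bar{R}_2
\end{aligned}
```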

The goal of the bootstrap test is to generate the sampling distribution of the difference of the means and to calculate a p-value for rejecting the null hypothesis.

Here are the steps to follow:

  1. Form two samples. Each sample contains all trades’ R-multiple values for one of the groups.
  2. For each resample, select random instances (with replacement) of R-multiples from each group (same number of instances as the group count). Calculate the mean for each group and compute their difference (bootstrapped statistic).
  3. Perform a large number of resamples to generate a large number of bootstrapped statistics (difference of the means between the 2 group resamples).
  4. Form the sampling distribution of the difference of means generated in the step above.
  5. Derive the p-value as the proportion of bootstrapped differences that are 0 or less.
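The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name, the seed and the R-multiple values are made up for the example and are not the trades or the code used for the results in this post.

```python
import random

def bootstrap_mean_differences(group1, group2, n_resamples=10_000, seed=None):
    """Return bootstrapped values of mean(group1 resample) - mean(group2 resample)."""
    rng = random.Random(seed)  # the seed only makes the sketch reproducible
    diffs = []
    for _ in range(n_resamples):
        # Step 2: draw with replacement, same size as each original group.
        resample1 = rng.choices(group1, k=len(group1))
        resample2 = rng.choices(group2, k=len(group2))
        mean1 = sum(resample1) / len(resample1)
        mean2 = sum(resample2) / len(resample2)
        diffs.append(mean1 - mean2)
    # Steps 3-4: the collected differences form the sampling distribution.
    return diffs

# Step 1: hypothetical R-multiples for each group (illustration only).
lower_vol = [2.5, -1.0, 0.8, -0.6, 3.1, -1.0, 0.4, 1.9, -0.7, 1.2]
high_vol = [-1.0, 0.5, -0.8, 1.1, -1.0, 0.2]

diffs = bootstrap_mean_differences(lower_vol, high_vol, n_resamples=2000, seed=42)

# Step 5: share of bootstrapped differences that are zero or negative.
p_value = sum(d <= 0 for d in diffs) / len(diffs)
```

Each group is resampled on its own, so the two groups can have different trade counts, as they will when one covers nine deciles and the other only one.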

For the case under study, the bootstrap gave us this sampling distribution (10,000 resamples):

[Figure: sampling distribution of the bootstrapped difference of means]

We can calculate the p-value, which is 0.0276 (97.24% of the observations are positive, leaving only 2.76% zero or negative). We can therefore reject the null hypothesis.
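To make the arithmetic concrete: with 10,000 resamples, a p-value of 0.0276 corresponds to 276 bootstrapped differences at or below zero. The snippet below reproduces that count on a synthetic stand-in for the distribution (the individual values are placeholders; only the 276 / 9,724 split mirrors the figures quoted above), and the function names are illustrative.

```python
def one_sided_p_value(diffs):
    """Proportion of bootstrapped differences that are zero or negative."""
    return sum(1 for d in diffs if d <= 0) / len(diffs)

def is_significant(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value falls below alpha."""
    return p_value < alpha

# Synthetic stand-in: 276 non-positive and 9,724 positive differences.
diffs = [-0.01] * 276 + [0.17] * 9724

p = one_sided_p_value(diffs)
reject_null = is_significant(p)
```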

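The conclusion is that the result is statistically significant at the 5% level (p-value less than 0.05): in only 2.76% of the resamples did the lower-volatility group fail to outperform the high-volatility group. Note that this is not the probability that the null hypothesis is true; the 5% level is the Type I error rate (rejecting a true null hypothesis) that we accept when applying the test.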

So, the filter seems to markedly improve the performance of the system. But of course, this is not a panacea… It is quite possible that a filtered trade might have ended up being a large runner.
With Trend Following’s dependency on infrequent large winners, missing a major trade because it has been filtered out could make a big difference to that year’s performance.

Random Numbers picture credits: artnoose via flickr (CC)

3 Comments so far ↓

  • Mark

    I have really learned a lot from your postings. I have especially liked your posts regarding bootstrap testing. I had already read Evidence-based Technical Analysis before your bootstrap posts, but your postings have helped reinforce the benefit of the test.
    Today’s post really hit home because I think I have found an additional element that looks like it can improve my current system. I will definitely be applying the bootstrap test the way you have described today to determine how much confidence I can have regarding the improvement this new element gives me.
    Thank you and keep up the great work.

  • Jez Liberty

    Mark – glad to hear that! And thanks for the comment…

  • Danae Lampe

    This is really interesting. I seem to remember that there was a program that would take the results of a system that was back-tested in TradeStation (back in the day, version 2000i or something) and do similar analysis. However, in that tool, they used it to optimize parameters for a trading system.

    But, I like the way you explained this and gave some practical examples to follow. Good to know that all of this stuff isn’t just theory for some people. :-)
