Originally Posted By: MatPed
Can you just summarize the pluses/minuses of Chan's approach vs the "traditional" Zorro one?


Please don't get the impression that this is a 'one or the other' type of situation where we choose one approach over another. The approach above has merits and limitations, as do Chan's other approaches and Zorro's tools. The idea is to understand how the different tools work and what their limitations are, and then apply them in a logical manner. Remember that we are dealing with uncertainty, and no single approach or combination of approaches will yield a conclusive solution.

I believe I've outlined above what to me are the main limitations of some of Zorro's tools that relate to backtesting and statistical significance. Chan uses a hypothesis testing approach in which we hypothesise that the true value of the test statistic of interest, based on an infinite data set (not a finite backtest), is zero (or 1, presumably, if we are looking at profit factor). Using this approach, we aim to reject this hypothesis with a certain degree of confidence. If our hypothesis is true, the probability distribution of the test statistic must have a zero mean. Therefore, if we know (or assume) a probability distribution, we can compute the probability that our statistic will be at least as large as that returned by the backtest. Obviously, the smaller this probability, the more confidently we can reject the hypothesis.
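To make the logic concrete, here is a minimal Python sketch of the simplest case, where we simply assume the distribution is Gaussian. The function name and the sample returns are hypothetical, made up for illustration; the test statistic used is the mean daily return divided by its standard deviation, scaled by the square root of the number of observations. Treat it as a sketch under that Gaussian assumption, not as a definitive implementation of anything in Chan's book or in Zorro.

```python
import math
import statistics

def gaussian_p_value(daily_returns):
    """One-sided p-value for H0: the true mean daily return is zero.

    Assumes (and this is a strong assumption) that the test statistic
    is normally distributed under H0.
    """
    n = len(daily_returns)
    mean = statistics.fmean(daily_returns)
    std = statistics.stdev(daily_returns)   # sample standard deviation
    z = math.sqrt(n) * mean / std           # test statistic
    # P(Z >= z) for a standard normal, via the complementary error function
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p

# Hypothetical daily returns, for illustration only
returns = [0.004, -0.002, 0.003, 0.001, -0.001, 0.005, -0.003, 0.002]
z, p = gaussian_p_value(returns)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```

The smaller p is, the more confidently the zero-mean hypothesis can be rejected, but only to the extent that the Gaussian assumption is reasonable for the returns in question.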

The main issue with this approach is determining the probability distribution of our statistic under the hypothesis we hope to reject. Hence, Chan's three approaches. The one I detailed is essentially an attempt to empirically model this probability distribution and see where the strategy's backtested statistic is located upon it.
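As a rough illustration of what "empirically modelling the distribution" can look like, here is a small Python sketch of a randomization test. It is not necessarily the exact procedure discussed earlier in the thread: the function names, the choice of mean return as the statistic, and the sample data are all assumptions of mine. The idea is to shuffle the strategy's positions against the same market returns many times, which keeps the number of long and short bars fixed but destroys any timing skill, and then see how often a shuffled version does at least as well as the real backtest.

```python
import random
import statistics

def mean_strategy_return(positions, market_returns):
    """Mean per-bar return of holding positions[i] (+1, -1 or 0) over bar i."""
    return statistics.fmean([p * r for p, r in zip(positions, market_returns)])

def randomization_p_value(positions, market_returns, n_trials=2000, seed=0):
    """Empirical one-sided p-value from shuffled positions (hypothetical helper)."""
    rng = random.Random(seed)               # fixed seed for repeatable runs
    observed = mean_strategy_return(positions, market_returns)
    shuffled = list(positions)
    hits = 0
    for _ in range(n_trials):
        rng.shuffle(shuffled)               # same long/short counts, random timing
        if mean_strategy_return(shuffled, market_returns) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return observed, (hits + 1) / (n_trials + 1)

# Hypothetical data, for illustration only
market = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01, 0.005, -0.015]
signal = [1, -1, 1, -1, 1, -1, 1, -1]
observed, p = randomization_p_value(signal, market)
print(f"observed mean return = {observed:.4f}, empirical p = {p:.4f}")
```

The p-value here is just the location of the backtested statistic within the simulated distribution. Any other statistic (Sharpe ratio, profit factor) could be swapped in for the mean return.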

Chan cites other limitations in his book too. For example, the hypothesis we hope to reject isn't unique, and different hypotheses can give rise to different estimates of statistical significance.

Originally Posted By: MatPed
Regarding the data, I was looking for a shortcut in the conversion process. I already have all the data I need from tickhistory. It's just a boring process: split the data into yearly files, apply the script,...

Integration between a data provider and Zorro is one of the items on my Zorro wish list (as in AmiBroker or MultiCharts).

Anyway, if you do not want to share them for any ethical reason, I understand that.


Well, no one ever accused me of being ethical! I suggested paying for the data purely for simplicity. For about $30, you can transfer gigabytes of data using an FTP client. I don't know how to set things up so that you can access the data in a similar fashion from my PC, and to be honest I've got better things to do than try to save you thirty bucks. Having said that, if you have a Dropbox account with enough storage (>50 GB), I could quite easily share it with you that way. I'd prefer to get permission from the www.histdata.com site owner first though.

If you do manage to get your hands on the data, I have some scripts that automate the conversion process that I would be happy to share. I paid someone on Elance to create one of them, and the credit for the other goes to a fellow Zorro user (yosoytrader). I assume he wouldn't be against me sharing his script, but again, I'd prefer to check first.