Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500

In recent years, machine learning research has gained momentum: new developments in deep learning allow for multiple levels of abstraction and are starting to supersede well-known and powerful tree-based techniques, which mainly operate on the original feature space. All these methods can be applied to various fields, including finance. This paper implements and analyzes the effectiveness of deep neural networks (DNN), gradient-boosted trees (GBT), random forests (RAF), and several ensembles of these methods in the context of statistical arbitrage. Each model is trained on lagged returns of all stocks in the S&P 500, after elimination of survivorship bias. From 1992 to 2015, daily one-day-ahead trading signals are generated from each model's probability forecast that a stock will outperform the general market. The k highest probabilities are converted into long positions and the k lowest into short positions, thus censoring the less certain middle part of the ranking. Empirical findings are promising. A simple, equal-weighted ensemble (ENS1) consisting of one deep neural network, one gradient-boosted tree, and one random forest produces out-of-sample returns exceeding 0.45 percent per day for k=10, prior to transaction costs. Although profits have declined in recent years, our findings pose a severe challenge to the semi-strong form of market efficiency.
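The signal construction described in the abstract (average the model probabilities, go long the k highest, go short the k lowest) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy probabilities, and the long-minus-short return convention in `long_short_return` are assumptions made for the example.

```python
import numpy as np


def ensemble_probabilities(*model_probs):
    """Equal-weighted average of per-model probability forecasts.

    Each argument is a 1-D array with one probability per stock.
    """
    return np.mean(np.stack(model_probs, axis=0), axis=0)


def top_bottom_k_positions(probs, k):
    """Return +1 for the k highest probabilities, -1 for the k lowest, 0 otherwise."""
    probs = np.asarray(probs, dtype=float)
    if k < 1 or 2 * k > probs.size:
        raise ValueError("need 1 <= k and 2*k <= number of stocks")
    order = np.argsort(probs, kind="stable")  # ascending ranking
    positions = np.zeros(probs.size, dtype=int)
    positions[order[:k]] = -1   # short the k least likely outperformers
    positions[order[-k:]] = 1   # long the k most likely outperformers
    return positions


def long_short_return(positions, next_day_returns):
    """Assumed convention: mean long return minus mean short return."""
    positions = np.asarray(positions)
    r = np.asarray(next_day_returns, dtype=float)
    return r[positions == 1].mean() - r[positions == -1].mean()


# Toy forecasts for six hypothetical stocks from three hypothetical models.
dnn = np.array([0.9, 0.1, 0.5, 0.6, 0.4, 0.2])
gbt = np.array([0.8, 0.2, 0.5, 0.7, 0.3, 0.3])
raf = np.array([0.7, 0.3, 0.5, 0.5, 0.5, 0.1])

ens = ensemble_probabilities(dnn, gbt, raf)
positions = top_bottom_k_positions(ens, k=2)
```

With k=2, the two stocks with the highest averaged probability receive a long position, the two with the lowest a short position, and the middle of the ranking is left out, which mirrors the censoring step in the abstract at a much smaller scale than k=10 on the S&P 500.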
