Cascading logistic regression onto gradient boosted decision trees for forecasting and trading stock indices

Abstract Forecasting the direction of daily changes in stock indices is an important yet difficult task for market participants. Advances in data mining and machine learning make it possible to develop more accurate predictions to assist investment decision making. This paper develops a learning architecture, LR2GBDT, for forecasting and trading stock indices, mainly by cascading the logistic regression (LR) model onto the gradient boosted decision trees (GBDT) model. Without any assumption on the underlying data-generating process, raw price data and twelve technical indicators are employed to extract the information contained in the stock indices. The proposed architecture is evaluated by comparing its experimental results with those of the LR, GBDT, SVM (support vector machine), NN (neural network) and TPOT (tree-based pipeline optimization tool) models on three stock indices from two different stock markets: an emerging market (Shanghai Stock Exchange Composite Index) and a mature market (Nasdaq Composite Index and S&P 500 Composite Stock Price Index). Under the same test conditions, the cascaded model not only outperforms the other models but also shows statistically and economically significant improvements when used in simple trading strategies, even when transaction costs are taken into account.
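
The evaluation idea in the abstract (predict the daily direction, then trade on the prediction net of transaction costs) can be illustrated with a minimal sketch. It is not the paper's actual strategy or cost model: the function names, the long/flat rule, and the proportional cost of 0.1% per position change are all assumptions made for illustration.

```python
def direction_labels(closes):
    """Label each day 1 if the next close is higher than today's, else 0."""
    return [1 if closes[t + 1] > closes[t] else 0
            for t in range(len(closes) - 1)]


def strategy_return(closes, signals, cost=0.001):
    """Cumulative return of a hypothetical long/flat strategy.

    signals[t] == 1 means hold the index from day t to day t + 1;
    signals[t] == 0 means stay in cash over that day.
    A proportional cost (assumed, not from the paper) is charged
    every time the position changes. Any position still open after
    the last signal is not closed, so no exit cost is charged for it.
    """
    wealth = 1.0
    position = 0  # start in cash
    for t, signal in enumerate(signals):
        if signal != position:
            wealth *= 1.0 - cost  # pay the cost of switching position
            position = signal
        if position == 1:
            wealth *= closes[t + 1] / closes[t]  # earn the index return
    return wealth - 1.0
```

Feeding `strategy_return` the output of `direction_labels` gives the return of a perfect forecaster, which serves as an upper bound. Feeding it a model's predicted directions instead shows how much of that bound survives once each position change is charged.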
