Evaluation of Tree-Based Ensemble Machine Learning Models in Predicting Stock Price Direction of Movement

Forecasting the direction and trend of stock prices is an important task that helps investors make prudent financial decisions in the stock market. Investment in the stock market carries substantial risk, and minimizing prediction error reduces that risk. Machine learning (ML) models typically perform better than statistical and econometric models, and ensemble ML models have been shown in the literature to outperform single ML models. In this work, we compare the effectiveness of tree-based ensemble ML models (Random Forest (RF), XGBoost Classifier (XG), Bagging Classifier (BC), AdaBoost Classifier (Ada), Extra Trees Classifier (ET), and Voting Classifier (VC)) in forecasting the direction of stock price movement. Data for eight stocks, randomly selected from three stock exchanges (NYSE, NASDAQ, and NSE), are used for the study. Each data set is split into a training set and a test set. Ten-fold cross-validation accuracy is used to evaluate the ML models on the training set. In addition, the ML models are evaluated on the test set using accuracy, precision, recall, F1-score, specificity, and area under the receiver operating characteristic curve (AUC-ROC). Kendall's W test of concordance is used to rank the performance of the tree-based ML algorithms. On the training set, the AdaBoost model performed better than the rest of the models. On the test set, the accuracy, precision, F1-score, and AUC metrics yielded rankings that were statistically significant, and the Extra Trees classifier outperformed the other models in all of those rankings.
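
The ranking step can be illustrated with a minimal sketch of Kendall's coefficient of concordance (W). It assumes each stock acts as a "rater" that ranks the models by one metric, that there are no tied scores (so no tie correction is applied), and that the accuracy values below are hypothetical placeholders rather than results from the study. The function names `scores_to_ranks` and `kendalls_w` are illustrative and are not taken from the paper.

```python
def scores_to_ranks(scores):
    """Convert metric scores to ranks (1 = best score). Assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        ranks[idx] = position
    return ranks


def kendalls_w(rank_matrix):
    """Kendall's W for m raters (rows) ranking n items (columns), no ties.

    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared
    deviations of the per-item rank sums from their mean m * (n + 1) / 2.
    """
    m = len(rank_matrix)
    n = len(rank_matrix[0])
    rank_sums = [sum(row[i] for row in rank_matrix) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical test-set accuracies: one row per stock, one column per model.
models = ["RF", "XG", "ET"]
accuracy_by_stock = [
    [0.81, 0.79, 0.85],
    [0.77, 0.74, 0.80],
    [0.83, 0.84, 0.88],
]

rank_matrix = [scores_to_ranks(row) for row in accuracy_by_stock]
w = kendalls_w(rank_matrix)

# Chi-square approximation for the significance test, with n - 1 degrees
# of freedom; compare against a chi-square table or a statistics library.
m, n = len(rank_matrix), len(models)
chi_square = m * (n - 1) * w
print(f"W = {w:.3f}, chi-square = {chi_square:.3f} (df = {n - 1})")
```

W ranges from 0 (no agreement among the stocks about the model ordering) to 1 (every stock ranks the models identically). When W is significant, the models can be ordered by their mean rank across stocks.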
