Predicting Stock Prices using Ensemble Learning and Sentiment Analysis

The recent success of Artificial Intelligence in the financial sector has led more firms to rely on stochastic models for predicting the behaviour of the market. Every day, quantitative analysts strive to attain better accuracy from their machine learning models for forecasting stock returns. Support Vector Machine (SVM) and Random Forest based regression models are known for their effectiveness in predicting closing prices. In this work, we propose a technique for analyzing and predicting the stock prices of companies using these two algorithms as an ensemble. Datasets from India's National Stock Exchange (NSE) containing basic market price information are preprocessed to include well-known leading technical indicators as features. Feature selection, which ranks features by their degree of influence on the final closing price, is incorporated to reduce the size of the training dataset. Additionally, we evaluate the effectiveness of considering public opinion about a company by employing sentiment analysis. Using a trained Word2Vec model, company-specific hashtagged posts from Twitter are classified as positive or negative. Our proposed ensemble model is then trained on a new dataset that combines the technical indicator data with the aggregated number of positive and negative tweets about a company over time. Our experiments indicate that in some scenarios the ensemble model performs better than its constituent models, and that its performance is highly dependent on the nature and size of the training data. However, combining technical indicator data with aggregated positive/negative tweet counts has a negligible effect on the performance of the ensemble model.
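As a rough illustration of the kind of ensemble described above, the following sketch combines an SVM regressor and a Random Forest regressor with scikit-learn. Several things here are assumptions made for the example and are not taken from the paper: the use of stacking with a linear meta-learner as the combination rule, the two toy features (previous close and a simple moving average), the hyperparameters, the helper names `make_features` and `build_ensemble`, and the synthetic random-walk prices standing in for NSE data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_features(close, window=5):
    """Build [previous close, simple moving average] rows and next-day targets.

    Hypothetical helper: the paper's actual technical indicators are not
    listed in the abstract, so two simple ones are used here.
    """
    close = np.asarray(close, dtype=float)
    rows, targets = [], []
    for t in range(window, len(close)):
        past = close[t - window:t]  # only prices observed before day t
        rows.append([past[-1], past.mean()])
        targets.append(close[t])
    return np.array(rows), np.array(targets)


def build_ensemble(random_state=0):
    """Stack an SVR and a Random Forest under a linear meta-learner (assumed)."""
    svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    forest = RandomForestRegressor(n_estimators=50, random_state=random_state)
    return StackingRegressor(
        estimators=[("svr", svr), ("rf", forest)],
        final_estimator=LinearRegression(),
        cv=3,
    )


# Synthetic random-walk closing prices, used only so the sketch runs.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, 120))

X, y = make_features(prices)
split = int(0.8 * len(X))  # chronological split: train on the earlier days
model = build_ensemble().fit(X[:split], y[:split])
predictions = model.predict(X[split:])
```

Aggregated daily counts of positive and negative tweets could be appended as two extra columns of `X` in the same way. Note that the internal cross-validation used by `StackingRegressor` here does not respect time order, so a real evaluation would need a more careful validation scheme.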
