Text Mining Approach to Analyse Stock Market Movement

The stock market (SM) is a significant sector of a country’s economy and plays a crucial role in the growth of its commerce and industry. Hence, discovering efficient ways to analyse and visualise stock market data is an important problem in modern finance. The use of Data Mining (DM) techniques to predict the stock market from historical prices has been studied extensively, but such approaches can only make assessments within the scope of existing information; they cannot model random behaviour of the stock market or explain the causes behind events. Textual data, which has so far yielded only limited success in stock market prediction, is a rich source of information, and analysing it may provide a better understanding of the market’s random behaviour. Text Mining (TM) combined with the Random Forest (RF) algorithm offers a novel approach to studying the critical indicators that contribute to the prediction of abnormal stock market movements. A Stock Market Random Forest-Text Mining system (SMRF-TM) is developed to mine the critical indicators related to the 2009 Dubai stock market debt standstill. Random Forest is applied to classify the extracted features into a set of semantic classes, extending current approaches from three to eight classes: critical down, down, neutral, up, critical up, economic, social and political. The study demonstrates that Random Forest outperformed the other classifiers and achieved the best accuracy in classifying the bigram features extracted from the corpus.
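As an illustration of the bigram features mentioned above, the following is a minimal sketch of bigram extraction and counting over a small corpus. It is not the SMRF-TM implementation: the function names `extract_bigrams` and `bigram_counts`, the simple letters-only tokenisation, and the two sample headlines are all assumptions made for this example.

```python
import re
from collections import Counter


def extract_bigrams(text):
    """Lowercase the text, keep alphabetic tokens, and return adjacent word pairs."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(zip(tokens, tokens[1:]))


def bigram_counts(documents):
    """Count how often each bigram occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(extract_bigrams(doc))
    return counts


# Hypothetical headlines, invented for illustration only.
headlines = [
    "Dubai World seeks debt standstill",
    "Markets fall after debt standstill request",
]
counts = bigram_counts(headlines)
```

Counts of this kind could then be turned into per-document feature vectors and passed to a classifier such as a random forest, with the eight semantic classes as labels. That step is omitted here because the abstract does not specify the feature weighting or the training setup.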
