Natural language based financial forecasting: a survey

Natural language processing (NLP), the pragmatic research perspective of computational linguistics, has become increasingly powerful over the past decade thanks to greater data availability and a range of newly developed techniques. This growing capability makes it possible to capture sentiment more accurately and semantics in a more nuanced way. Naturally, many applications now seek improvements by adopting cutting-edge NLP techniques, and financial forecasting is no exception. As a result, articles that leverage NLP techniques to predict financial markets are accumulating quickly, gradually establishing the research field of natural language based financial forecasting (NLFF), known from the application perspective as stock market prediction. This review article clarifies the scope of NLFF research by ordering and structuring the techniques and applications found in related work. The survey also aims to improve understanding of the progress and hotspots in NLFF and to stimulate discussion across disciplines.
