Word sense disambiguation application in sentiment analysis of news headlines: an applied approach to FOREX market prediction

Sentiment analysis of textual content has become a popular approach to market prediction. However, without a word sense disambiguation step, it is questionable whether the sentiment expressed by the context is identified correctly. Meanwhile, many studies in natural language processing have focused on word sense disambiguation, yet the link between these two logically related fields of study has remained weak. With two motivations, we therefore propose a system for FOREX market prediction that exploits word sense disambiguation in the sentiment analysis of news headlines and predicts the directional movement of a currency pair. The first motivation is to implement a novel word sense disambiguation method that can determine the proper senses of all significant words in a news headline. The main contributions that make this possible are novel approaches termed Relevant Gloss Retrieval, Similarity Threshold, and Verb Nominalization, together with optimization measures that decrease execution time. The second motivation is to show that determining the proper senses of significant words in textual content improves the determination of the sentiment conveyed by the context, and consequently any application based on sentiment analysis. Including word sense disambiguation in the proposed system achieves this second motivation: tests carried out on the same dataset show that the proposed system outperforms one of the best systems (to the best of our knowledge) proposed for market prediction, improving accuracy from 83.33% to 91.67%. Ample detail is provided for reproducing the system.
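To illustrate why the sense chosen for a word can change the sentiment assigned to a headline, the following is a minimal sketch that uses a simplified Lesk-style gloss overlap. It is not the proposed system: it does not implement Relevant Gloss Retrieval, Similarity Threshold, or Verb Nominalization. The sense inventory, glosses, sentiment scores, and function names are all hypothetical and invented for illustration.

```python
# Toy sense inventory. Every gloss and sentiment score here is an invented
# assumption for illustration, not data from WordNet or SentiWordNet.
SENSES = {
    "rally": [
        {"gloss": "a marked recovery of prices after a decline in the market",
         "score": 0.6},
        {"gloss": "a large gathering of people for protest or political support",
         "score": -0.1},
    ],
}

STOPWORDS = {"a", "of", "the", "in", "for", "or", "after", "to", "and"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, headline):
    """Pick the sense whose gloss shares the most words with the headline."""
    context = tokens(headline) - {word}
    return max(SENSES[word], key=lambda s: len(context & tokens(s["gloss"])))


def headline_sentiment(headline):
    """Average the scores of the selected senses of all known words."""
    scores = [disambiguate(w, headline)["score"]
              for w in tokens(headline) if w in SENSES]
    return sum(scores) / len(scores) if scores else 0.0


financial = "dollar prices rally in the currency market"
political = "thousands join protest rally in capital"
print(headline_sentiment(financial))
print(headline_sentiment(political))
```

In this toy setup the same word, "rally", receives a different sense and therefore a different sentiment score depending on the surrounding headline words, which is the effect that a sentiment system without disambiguation cannot capture.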
