Financial Sentiment Analysis: An Investigation into Common Mistakes and Silver Bullets

The recent dominance of machine learning-based natural language processing methods has fostered a culture of overemphasizing model accuracy rather than studying the reasons behind model errors. Interpretability, however, is a critical requirement for many downstream AI and NLP applications, e.g., in finance, healthcare, and autonomous driving. Instead of proposing a “new model”, this study investigates the error patterns of some widely acknowledged sentiment analysis methods in the finance domain. We discover that (1) methods belonging to the same cluster are prone to similar error patterns, and (2) six types of linguistic features are pervasive in the common errors. These findings provide important clues and practical considerations for improving sentiment analysis models for financial applications.
