Classifying facts and opinions in Twitter messages: a deep learning-based approach

ABSTRACT Massive social media data present businesses with an immense opportunity to extract useful insights. However, social media messages typically consist of both facts and opinions, posing a challenge to analytics applications that focus on either facts or opinions. Distinguishing facts from opinions may significantly improve subsequent analytics tasks. In this study, we propose a deep learning-based algorithm that automatically separates facts from opinions in Twitter messages. The algorithm outperformed multiple popular baselines in an experiment we conducted. We further applied the proposed algorithm to track customer complaints and found that it indeed benefits subsequent analytics applications.
