Social information discovery enhanced by sentiment analysis techniques

Abstract: In recent years, the massive diffusion of social networks has made available a large amount of user-generated content, for the most part in the form of textual data that contain people’s thoughts and emotions about a great variety of topics. To exploit this publicly available information, in this work we introduce a social information discovery system that processes multiple social networks simultaneously in an integrated scenario. The system is designed to ensure flexibility and scalability, thus enabling (near-)real-time analysis even with high rates of content creation and large amounts of heterogeneous data. Furthermore, a noise detection technique ensures that the analyzed posts/tweets are highly relevant to the domain of interest. We also propose a lexicon-based sentiment analysis algorithm to extract and measure users’ opinions, in order to support collaboration and open innovation. Polysemous words and negations are typically challenging for lexicon-based approaches; for this reason, we introduce both a word sense disambiguation algorithm and a negation handling technique. Experiments on several datasets show that the combined use of both techniques improves classification accuracy on three-class sentiment analysis.
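As a rough illustration of what a lexicon-based classifier with negation handling can look like, the following minimal Python sketch scores a text against a polarity lexicon and flips the polarity of words that follow a negation. It is not the algorithm proposed in the paper: the lexicon entries, the fixed three-token negation window, the neutrality threshold, and all function names are assumptions made for this example, and word sense disambiguation is left out entirely.

```python
import re

# Toy polarity lexicon. The words and scores are hypothetical and are
# not taken from the paper or from any published lexical resource.
LEXICON = {
    "good": 0.7, "great": 0.9, "love": 0.8,
    "bad": -0.7, "terrible": -0.9, "hate": -0.8,
}

# Assumed negation cues and scope: a cue inverts the polarity of the
# next NEGATION_WINDOW tokens.
NEGATIONS = {"not", "no", "never"}
NEGATION_WINDOW = 3


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def score(text):
    """Sum lexicon polarities, inverting those inside a negation window."""
    total = 0.0
    flips_left = 0
    for token in tokenize(text):
        if token in NEGATIONS:
            flips_left = NEGATION_WINDOW
            continue
        polarity = LEXICON.get(token, 0.0)
        if flips_left > 0:
            polarity = -polarity
            flips_left -= 1
        total += polarity
    return total


def classify(text, threshold=0.1):
    """Map the score to one of three classes (threshold is an assumption)."""
    s = score(text)
    if s > threshold:
        return "positive"
    if s < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sample in ("The camera is great",
                   "The camera is not good",
                   "The camera arrived today"):
        print(sample, "->", classify(sample))
```

A fixed token window is only one way to approximate the scope of a negation; it keeps the sketch short but will misjudge longer or nested constructions.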
