Discovering Hidden Concepts in Predictive Models for Texts' Polarization

The growth of the Internet and of information technology has profoundly changed how people communicate: much of that communication now takes place on social media or in thematic forums. This challenges the traditional notion of Customer Relationship Management (CRM) and pushes businesses toward a prompt and accurate understanding of the sentiments expressed, so that they can direct their marketing actions accordingly. In this paper, the authors propose combining supervised Sentiment Analysis (SA) with a probabilistic kernel discriminant to provide a robust classifier of text polarization. The resulting partition is also described by means of a statistical characterization of the texts. The approach is promising not only in terms of classification accuracy but also in terms of knowledge extraction. A real case study is presented to test and demonstrate the effectiveness of the proposed strategy.
