Sentiment Classification of Spanish Reviews: An Approach based on Feature Selection and Machine Learning Methods

Sentiment analysis aims to extract users’ opinions from review documents. There are currently two main approaches to sentiment analysis: semantic orientation and machine learning. Approaches based on machine learning (ML) methods operate on a set of features extracted from users’ opinions. However, the high dimensionality of the feature vector reduces the effectiveness of this approach. To address this problem, we propose a sentiment classification method based on feature selection mechanisms and ML methods. The method uses a hybrid feature extraction technique based on part-of-speech (POS) patterns and dependency parsing. The extracted features are semantically enriched through commonsense knowledge bases. A feature selection method is then applied to eliminate noisy and irrelevant features. Finally, a set of classifiers is trained to classify unseen data. To assess the effectiveness of our approach, we conducted an evaluation in the movie and technological product domains. We also compared our proposal with well-known methods and algorithms used in the sentiment classification field. Our proposal obtained encouraging results, with F-measure values ranging from 0.786 to 0.898 across these domains.
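The feature selection and evaluation steps described above can be illustrated with a minimal sketch. The abstract does not name the selection criterion, so the chi-square score used here is an assumption chosen for illustration, not necessarily the authors' method. The function names (`chi_square`, `select_features`, `f_measure`) and the toy Spanish tokens are hypothetical as well.

```python
def chi_square(docs, labels, feature, positive="pos"):
    """Chi-square statistic of a binary feature against a binary label.

    Assumption: chi-square stands in for the unspecified selection method.
    """
    # 2x2 contingency table: feature present/absent vs. positive/negative.
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = feature in tokens
        if present and label == positive:
            a += 1
        elif present:
            b += 1
        elif label == positive:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, k):
    """Keep the k features with the highest chi-square score."""
    vocab = sorted({token for doc in docs for token in doc})
    ranked = sorted(vocab, key=lambda f: chi_square(docs, labels, f), reverse=True)
    return ranked[:k]


def f_measure(gold, predicted, positive="pos"):
    """Balanced F-measure (F1) for the positive class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, predicted))
    fp = sum(g != positive and p == positive for g, p in zip(gold, predicted))
    fn = sum(g == positive and p != positive for g, p in zip(gold, predicted))
    total = 2 * tp + fp + fn
    return 0.0 if total == 0 else 2 * tp / total


# Toy reviews, each reduced to a set of already-extracted features.
docs = [
    {"trama", "excelente"},
    {"trama", "aburrida"},
    {"actor", "excelente"},
    {"actor", "aburrida"},
]
labels = ["pos", "neg", "pos", "neg"]

selected = select_features(docs, labels, k=2)
score = f_measure(labels, ["pos", "pos", "pos", "neg"])
```

In this toy data, "trama" and "actor" occur equally often in both classes, so they score zero and are discarded, while the two opinion words are kept. A classifier would then be trained on the reduced feature set only.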
