Improving subjectivity detection for Spanish texts using subjectivity word sense disambiguation based on knowledge

In this paper, we present a Sentence-level Subjectivity Detection method for Spanish that uses knowledge-based Subjectivity Word Sense Disambiguation (SWSD). We apply a classic Word Sense Disambiguation method, using the Spanish WordNet included in the Multilingual Central Repository 3.0 and the WordNet-Pr as the knowledge base. Because WordNet and SentiWordNet are aligned, we use the latter as a semantic resource annotated with polarity values to determine whether a word expresses subjectivity or objectivity; the subjectivity levels are defined beforehand using a fuzzy clustering algorithm. Because few Sentiment Analysis resources exist for Spanish, the SemCor corpus was used to analyze the attributes to be used. Finally, a Rule-based classifier was created to detect subjective sentences. The method was run on a Spanish corpus created in this work. The results show that our approach contributes positively to the Subjectivity Detection task, despite relying on resources created for English.
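
The final classification step can be illustrated with a minimal sketch. It assumes the words of a sentence have already been disambiguated to sense identifiers and that each sense carries SentiWordNet-style positive and negative scores (in SentiWordNet, objectivity is 1 minus their sum). Everything else here is hypothetical and not taken from the paper: the toy scores, the sense identifiers, the fixed thresholds (a stand-in for the levels the paper obtains by fuzzy clustering), and the rule itself.

```python
from typing import Dict, List, Tuple

# Hypothetical toy inventory: sense id -> (positive, negative) scores,
# in the style of SentiWordNet, where objectivity = 1 - (pos + neg).
TOY_SCORES: Dict[str, Tuple[float, float]] = {
    "excelente#a#1": (0.75, 0.0),
    "interesante#a#1": (0.375, 0.0),
    "problema#n#1": (0.0, 0.5),
    "mesa#n#1": (0.0, 0.0),
    "libro#n#1": (0.0, 0.0),
}


def subjectivity_level(sense: str, high: float = 0.6, medium: float = 0.3) -> str:
    """Map a sense to a coarse subjectivity level.

    The fixed thresholds are an assumption standing in for the
    fuzzy-clustering step described in the abstract.
    """
    pos, neg = TOY_SCORES.get(sense, (0.0, 0.0))  # unknown senses count as objective
    score = pos + neg
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


def is_subjective(senses: List[str]) -> bool:
    """Hypothetical rule: one 'high' sense or two 'medium' senses."""
    levels = [subjectivity_level(s) for s in senses]
    return levels.count("high") >= 1 or levels.count("medium") >= 2


if __name__ == "__main__":
    print(is_subjective(["excelente#a#1", "libro#n#1"]))
    print(is_subjective(["mesa#n#1", "libro#n#1"]))
```

Keeping the level assignment separate from the rule means the thresholds could later be replaced by cluster memberships without touching the classifier.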
