Exploring the use of linguistic features in sentiment analysis

In this paper we explore the potential of genre-revealing features for automatic sentiment analysis. In particular, we combine a small subset of the ‘linguistic facets’ employed in recent experiments on automatic genre identification with more traditional sentiment-revealing features, and apply them to two single-genre corpora: a corpus of English blogs and a corpus of French reviews (relectures). Although still preliminary, the results suggest that linguistic facets may have a positive influence on sentiment analysis: 6 of the 14 facets used in the experiments rank among the 22 most important discriminative features.
