Affective language model adaptation via corpus selection

We investigate unsupervised adaptation algorithms for the semantic-affective models proposed in [1, 2]. In these models, affective ratings at the utterance level are generated from an emotional lexicon, which in turn is created using a semantic (similarity) model estimated over raw, unlabeled text. Motivated by methods used in language modeling and grammar induction, we propose the use of pragmatic constraints and perplexity as criteria to filter the unlabeled data used to estimate the semantic similarity model. The proposed adaptation method creates task-dependent semantic similarity models and task-dependent word/term affective ratings. The adaptation algorithms are tested on anger/distress detection in transcribed speech data and on sentiment analysis of tweets, showing significant relative classification error reductions of up to 10%.
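
As a rough illustration of the perplexity criterion, the sketch below keeps only those candidate sentences that a small in-domain language model finds predictable. It is a minimal sketch under stated assumptions, not the method used in the paper: the add-one smoothed unigram model, the whitespace tokenizer, the threshold value, and the names `train_unigram`, `perplexity`, and `select_corpus` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count tokens in a small in-domain sample (whitespace tokenization)."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    return counts, sum(counts.values())


def perplexity(sentence, counts, total, vocab_size):
    """Per-token perplexity under an add-one smoothed unigram model."""
    tokens = sentence.lower().split()
    if not tokens:
        return float("inf")  # empty sentences are never selected
    log_prob = sum(
        math.log((counts[tok] + 1) / (total + vocab_size)) for tok in tokens
    )
    return math.exp(-log_prob / len(tokens))


def select_corpus(candidates, in_domain, threshold):
    """Keep candidate sentences whose perplexity is at or below threshold."""
    counts, total = train_unigram(in_domain)
    vocab_size = len(counts) + 1  # +1 reserves probability mass for unseen tokens
    return [
        s for s in candidates
        if perplexity(s, counts, total, vocab_size) <= threshold
    ]


# Hypothetical data: two in-domain utterances and two unlabeled candidates.
in_domain = ["i am very angry about this", "this is terrible service"]
candidates = ["i am angry", "quarterly revenue exceeded forecasts"]
selected = select_corpus(candidates, in_domain, threshold=15.0)
```

With this toy data, the candidate that shares vocabulary with the in-domain sample scores a lower perplexity and is retained, while the off-topic candidate is filtered out. A realistic setup would likely use a higher-order n-gram model and a threshold tuned on held-out data.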

[1]  Preslav Nakov, et al. SemEval-2013 Task 2: Sentiment Analysis in Twitter, 2013, *SEMEVAL.

[2]  Yulan He, et al. Joint sentiment/topic model for sentiment analysis, 2009, CIKM.

[3]  Alexandros Potamianos, et al. Web data harvesting for speech understanding grammar induction, 2013, INTERSPEECH.

[4]  M. Bradley, et al. Affective Norms for English Words (ANEW): Stimuli, instruction manual and affective ratings (Tech Report C-1), 1999.

[5]  Michael L. Littman, et al. Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus, 2002, ArXiv.

[6]  Shrikanth S. Narayanan, et al. Kernel Models for Affective Lexicon Creation, 2011, INTERSPEECH.

[7]  Sabine Bergler, et al. CLaC and CLaC-NB: Knowledge-based and corpus-based approaches to sentiment tagging, 2007, Fourth International Workshop on Semantic Evaluations (SemEval-2007).

[8]  Mei-Yuh Hwang, et al. Web-data augmented language models for Mandarin conversational speech recognition, 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.

[9]  Carlo Strapparava, et al. WordNet Affect: an Affective Extension of WordNet, 2004, LREC.

[10]  Carlo Strapparava, et al. SemEval-2007 Task 14: Affective Text, 2007, Fourth International Workshop on Semantic Evaluations (SemEval-2007).

[11]  Shrikanth S. Narayanan, et al. Toward detecting emotions in spoken dialogs, 2005, IEEE Transactions on Speech and Audio Processing.

[12]  Andrea Esuli, et al. SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining, 2006, LREC.

[13]  M. Bradley, et al. Affective Norms for English Words (ANEW): Instruction Manual and Affective Ratings, 1999.

[14]  Gerald C. Davison, et al. Experimentally induced distraction impacts cognitive but not emotional processes in think-aloud cognitive assessment, 2014, Front. Psychol.

[15]  Karo Moilanen. Packed Feelings and Ordered Sentiments: Sentiment Parsing with Quasi-compositional Polarity Sequencing and Compression, 2010.

[16]  François-Régis Chaumartin, et al. UPAR7: A knowledge-based system for headline sentiment tagging, 2007, Fourth International Workshop on Semantic Evaluations (SemEval-2007).

[17]  F. J. Pelletier. The Principle of Semantic Compositionality, 1994.

[18]  Xu Ling, et al. Topic sentiment mixture: modeling facets and opinions in weblogs, 2007, WWW '07.

[19]  Shrikanth S. Narayanan, et al. Combining acoustic and language information for emotion recognition, 2002, INTERSPEECH.

[20]  Shrikanth S. Narayanan, et al. Distributional Semantic Models for Affective Text Analysis, 2013, IEEE Transactions on Audio, Speech, and Language Processing.