Comparing word, character, and phoneme n-grams for subjective utterance recognition

In this paper, we compare the performance of classifiers trained on word n-grams, character n-grams, and phoneme n-grams for recognizing subjective utterances in multiparty conversation. We show that very shallow linguistic representations, such as character n-grams, are valuable for this task; in particular, they yield gains in the recall of subjective utterances. Copyright © 2008 ISCA.
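To make the three representations concrete, the following is a minimal sketch of how word, character, and phoneme n-grams might be extracted from a single utterance. It is not the feature extraction used in the paper, which the abstract does not describe. The function names, the lowercasing, the whitespace tokenization, and the assumption that phonemes arrive as a ready-made list of symbols (for example, from a phoneme recognizer) are all illustrative choices.

```python
from typing import List, Sequence


def ngrams(items: Sequence, n: int) -> List[tuple]:
    """Return every contiguous run of n items as a tuple."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def word_ngrams(utterance: str, n: int) -> List[str]:
    """Word n-grams over a lowercased, whitespace-tokenized utterance (assumed tokenization)."""
    tokens = utterance.lower().split()
    return [" ".join(gram) for gram in ngrams(tokens, n)]


def char_ngrams(utterance: str, n: int) -> List[str]:
    """Character n-grams over the lowercased utterance, spaces included."""
    return ["".join(gram) for gram in ngrams(utterance.lower(), n)]


def phoneme_ngrams(phonemes: Sequence[str], n: int) -> List[str]:
    """Phoneme n-grams over a hypothetical list of phoneme symbols."""
    return [" ".join(gram) for gram in ngrams(phonemes, n)]


if __name__ == "__main__":
    print(word_ngrams("I really like it", 2))
    print(char_ngrams("um ok", 3))
    print(phoneme_ngrams(["ay", "l", "ay", "k"], 2))
```

Character n-grams need no tokenizer or lexicon and can span word boundaries, which is what makes them a very shallow representation in the sense of the abstract. Each feature list would then typically be turned into counts or binary indicators before being passed to a classifier.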
