A Shallow Approach to Subjectivity Classification

We present a shallow linguistic approach to subjectivity classification. Using multinomial kernel machines, we demonstrate that a data representation based on counting character n-grams improves on results previously attained on the MPQA corpus using word-based n-grams and syntactic information. We compare two types of string-based representations: key substring groups and character n-grams. We find that word-spanning character n-grams significantly reduce the bias of a classifier and boost its accuracy.

Copyright © 2008, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
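To make the representation concrete, the following minimal sketch counts character n-grams in a sentence, either across word boundaries (word-spanning) or within each word only. It is an illustration under our own assumptions, not the implementation used in the experiments: the function name `char_ngrams`, the `span_words` flag, and the whitespace normalization are hypothetical choices.

```python
from collections import Counter


def char_ngrams(text, n, span_words=True):
    """Count character n-grams of length n in text (illustrative sketch).

    span_words=True  : slide the window over the whole string, so n-grams
                       may contain spaces and cross word boundaries.
    span_words=False : slide the window inside each whitespace-delimited
                       token only.
    """
    if span_words:
        # Collapse runs of whitespace so the sentence is one unit.
        units = [" ".join(text.split())]
    else:
        units = text.split()

    counts = Counter()
    for unit in units:
        # A unit shorter than n yields no n-grams.
        for i in range(len(unit) - n + 1):
            counts[unit[i:i + n]] += 1
    return counts


spanning = char_ngrams("not bad", 3)
within = char_ngrams("not bad", 3, span_words=False)
```

With `span_words=True`, the trigrams of "not bad" include "ot ", "t b" and " ba", which straddle the word boundary; with `span_words=False`, only "not" and "bad" are produced. The resulting count vectors could then serve as input features for a classifier.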