Single and Cross-domain Polarity Classification using String Kernels

The polarity classification task aims to automatically identify whether a subjective text is positive or negative. When the target domain differs from the domain on which the model was trained, the setting is called cross-domain, and it usually requires a domain adaptation method. In this work, we study single and cross-domain polarity classification from the perspective of string kernels. Unlike classical domain adaptation methods, which use texts from both domains to detect pivot features, we do not use the target domain for training. Our approach detects the lexical peculiarities that characterise text polarity and maps them into a domain-independent space by means of kernel discriminant analysis. Experimental results show state-of-the-art performance in both single and cross-domain polarity classification.
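To make the idea of a string kernel concrete, the following is a minimal sketch of a character n-gram (spectrum) kernel, which measures the similarity of two texts by the n-grams they share. The abstract does not specify which string kernel the approach uses, so the choice of the spectrum kernel, the function names, and the default n-gram length of 3 are assumptions made for illustration only.

```python
from collections import Counter
import math


def char_ngrams(text, n):
    """Count every contiguous character n-gram in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def spectrum_kernel(a, b, n=3):
    """Spectrum kernel: inner product of the two n-gram count vectors."""
    counts_a, counts_b = char_ngrams(a, n), char_ngrams(b, n)
    # Iterate over the smaller profile; missing n-grams count as zero.
    if len(counts_a) > len(counts_b):
        counts_a, counts_b = counts_b, counts_a
    return sum(c * counts_b[g] for g, c in counts_a.items())


def normalized_kernel(a, b, n=3):
    """Scale the kernel to [0, 1] so that text length does not dominate."""
    k_aa = spectrum_kernel(a, a, n)
    k_bb = spectrum_kernel(b, b, n)
    if k_aa == 0 or k_bb == 0:
        return 0.0  # at least one text is shorter than n characters
    return spectrum_kernel(a, b, n) / math.sqrt(k_aa * k_bb)
```

A matrix of such pairwise values over the training texts could then be passed to any kernel method, such as the kernel discriminant analysis mentioned above. Because the features are character sequences rather than words, no tokenisation or domain-specific vocabulary is needed.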
