Semantic role-based representations in text classification

Although the bag-of-words representation yields good results in automatic text classification, it is not suitable for all classification problems, and some problems require richer text representations. In this paper, we propose two text representation models based on semantic role labels and analyze them in text classification scenarios. We also evaluate the combination of bag-of-words with a semantic representation using ensemble multi-view strategies. We explore different classification problems on two text collections and point out situations that require more than a bag-of-words. The experimental evaluation indicates that combining bag-of-words with a text representation based on semantic role labels can improve text classification accuracy.
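
To make the multi-view idea concrete, the following minimal sketch trains one classifier per view (bag-of-words and role-tagged tokens) and averages their class probabilities. It is an illustration under stated assumptions, not the representations or ensemble strategies evaluated in the paper: the toy sentences, the PropBank-style tags (`A0`, `V`, `A1`), the class names, and the helpers `NaiveBayes` and `combine` are all hypothetical, and the role labels are written by hand rather than produced by a semantic role labeler.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing over string features."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for feats, y in zip(docs, labels):
            self.counts[y].update(feats)
        self.vocab = {f for c in self.classes for f in self.counts[c]}
        return self

    def predict_proba(self, feats):
        n_docs = sum(self.prior.values())
        log_scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values())
            score = math.log(self.prior[c] / n_docs)
            for f in feats:
                if f in self.vocab:  # ignore features never seen in training
                    score += math.log(
                        (self.counts[c][f] + 1) / (total + len(self.vocab))
                    )
            log_scores[c] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


def combine(view_probas):
    """Average the class probabilities produced by each view (soft voting)."""
    classes = view_probas[0].keys()
    return {c: sum(p[c] for p in view_probas) / len(view_probas) for c in classes}


# Hypothetical toy task: is the agent (A0) of the sentence an animal or a human?
# Each training item is (bag-of-words view, hand-written role view, label).
train = [
    (["dog", "bites", "man"], ["A0:dog", "V:bites", "A1:man"], "animal_agent"),
    (["man", "bites", "dog"], ["A0:man", "V:bites", "A1:dog"], "human_agent"),
    (["cat", "chases", "boy"], ["A0:cat", "V:chases", "A1:boy"], "animal_agent"),
    (["boy", "chases", "cat"], ["A0:boy", "V:chases", "A1:cat"], "human_agent"),
]
labels = [y for _, _, y in train]
bow_model = NaiveBayes().fit([b for b, _, _ in train], labels)
srl_model = NaiveBayes().fit([r for _, r, _ in train], labels)

# Unseen sentence: "dog chases boy".
p_bow = bow_model.predict_proba(["dog", "chases", "boy"])
p_srl = srl_model.predict_proba(["A0:dog", "V:chases", "A1:boy"])
p_combined = combine([p_bow, p_srl])
prediction = max(p_combined, key=p_combined.get)

print("bag-of-words view:", p_bow)
print("role-based view:  ", p_srl)
print("combined:", p_combined, "->", prediction)
```

In this toy setting both classes contain exactly the same words, so the bag-of-words view cannot separate them, while the role-tagged view can. Averaging probabilities is only one possible combination rule; weighted voting or stacking would fit the same structure.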
