Support Vector Learning for Semantic Argument Classification

The natural language processing community has recently experienced a growth of interest in domain-independent shallow semantic parsing: the process of assigning a Who did What to Whom, When, Where, Why, How, etc. structure to plain text. This process entails identifying the groups of words in a sentence that represent these semantic arguments and assigning specific labels to them. It could play a key role in NLP tasks such as Information Extraction, Question Answering and Summarization. We propose a machine learning algorithm for semantic role parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines, which we show give a large improvement in performance over earlier classifiers. We obtain further performance improvements through a number of new features designed to improve generalization to unseen data, such as automatic clustering of verbs. We also report on various analytic studies that examine which features are most important, compare our classifier to other machine learning algorithms in the literature, and test its generalization to a new test set from a different genre. On the task of assigning semantic labels to the PropBank (Kingsbury, Palmer, & Marcus, 2002) corpus, our final system has a precision of 84% and a recall of 75%, which are the best results currently reported for this task. Finally, we explore a completely different architecture that does not require a deep syntactic parse. We reformulate the task as a combined chunking and classification problem, thus allowing our algorithm to be applied to new languages or genres of text for which statistical syntactic parsers may not be available.
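
To make the chunking reformulation concrete, here is a minimal sketch of how argument spans could be encoded as per-token tags that a classifier then predicts one token at a time. The abstract does not specify the tagging scheme; the IOB2 encoding, the function name `spans_to_iob`, and the example sentence with its role labels are all assumptions chosen for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical helper: encode labeled argument spans as IOB2 tags.
# The IOB2 scheme is an assumption; the abstract only says the task is
# recast as combined chunking and classification.
def spans_to_iob(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end_exclusive, label) spans to one tag per token."""
    tags = ["O"] * n_tokens  # "O" marks tokens outside every argument
    for start, end, label in spans:
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first token of the argument
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the argument
    return tags

# Illustrative sentence with PropBank-style labels for the predicate "gave".
tokens = ["John", "gave", "Mary", "a", "book", "yesterday"]
spans = [(0, 1, "ARG0"), (2, 3, "ARG2"), (3, 5, "ARG1"), (5, 6, "ARGM-TMP")]
tags = spans_to_iob(len(tokens), spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this encoding each token receives exactly one tag, so identifying argument boundaries and assigning role labels reduce to a single multiclass decision per token, which needs no full syntactic parse as input.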

[1]  Daniel Gildea,et al.  The Necessity of Parsing for Predicate Argument Recognition , 2002, ACL.

[2]  Daniel Jurafsky,et al.  Shallow Semantic Parsing using Support Vector Machines , 2004, NAACL.

[3]  Daniel Gildea,et al.  Automatic Labeling of Semantic Roles , 2000, ACL.

[4]  Alexander J. Smola,et al.  Advances in Large Margin Classifiers , 2000 .

[5]  Mitchell P. Marcus,et al.  Adding Semantic Annotation to the Penn TreeBank , 1998 .

[6]  Eugene Charniak,et al.  Immediate-Head Parsing for Language Models , 2001, ACL.

[7]  Yuji Matsumoto,et al.  Chunking with Support Vector Machines , 2001, NAACL.

[8]  Yuji Matsumoto,et al.  Use of Support Vector Learning for Chunk Identification , 2000, CoNLL/LLL.

[9]  Thomas Hofmann,et al.  Statistical Models for Co-occurrence Data , 1998 .

[10]  Eduard H. Hovy,et al.  A Maximum Entropy Approach to FrameNet Tagging , 2003, HLT-NAACL.

[11]  Ann Bies,et al.  The Penn Treebank: Annotating Predicate Argument Structure , 1994, HLT.

[12]  Richard M. Schwartz,et al.  An Algorithm that Learns What's in a Name , 1999 .

[13]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[14]  Daniel Gildea,et al.  Identifying Semantic Roles Using Combinatory Categorial Grammar , 2003, EMNLP.

[15]  Dania Egedi,et al.  A freely available wide coverage morphological analyzer for English , 1992, COLING 1992.

[16]  Eugene Charniak,et al.  Assigning Function Tags to Parsed Text , 2000, ANLP.

[17]  J. R. Quinlan,et al.  Data Mining Tools See5 and C5.0 , 2004 .

[18]  Marti A. Hearst.  Untangling Text Data Mining , 1999, ACL.

[19]  Ulrich H.-G. Kreßel,et al.  Pairwise classification and support vector machines , 1999 .

[20]  Sean Wallis,et al.  Knowledge Discovery in Grammatically Analysed Corpora , 2001, Data Mining and Knowledge Discovery.

[21]  Yoram Singer,et al.  Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..

[22]  Owen Rambow,et al.  Use of Deep Linguistic Features for the Recognition and Labeling of Semantic Arguments , 2003, EMNLP.

[23]  David M. Magerman.  Natural Language Parsing as Statistical Pattern Recognition , 1994, ArXiv.

[24]  Dania Egedi,et al.  A Freely Available Wide Coverage Morphological Analyzer for English , 1992, COLING.

[25]  Dekang Lin,et al.  Automatic Retrieval and Clustering of Similar Words , 1998, ACL.

[26]  Wayne H. Ward,et al.  Target Word Detection and Semantic Role Chunking using Support Vector Machines , 2003, NAACL.

[27]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[28]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[29]  Nello Cristianini,et al.  Classification using String Kernels , 2000 .

[30]  John B. Lowe,et al.  The Berkeley FrameNet Project , 1998, ACL.

[31]  Michael Collins,et al.  Head-Driven Statistical Models for Natural Language Parsing , 2003, CL.

[32]  Roger Levy,et al.  A Generative Model for Semantic Role Labeling , 2003, ECML.

[33]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[34]  Mitchell P. Marcus,et al.  Text Chunking using Transformation-Based Learning , 1995, VLC@ACL.

[35]  Martha Palmer,et al.  Adding predicate argument structure to the Penn TreeBank , 2002 .

[36]  Erik F. Tjong Kim Sang,et al.  Representing Text Chunks , 1999, EACL.

[37]  Sanda M. Harabagiu,et al.  Using Predicate-Argument Structures for Information Extraction , 2003, ACL.

[38]  Richard M. Schwartz,et al.  An Algorithm that Learns What's in a Name , 1999, Machine Learning.