A Method for Automatically Constructing Case Frames for English

Case frames are an important knowledge base for a variety of natural language processing (NLP) systems. Practical, real-world use of these systems requires wide-coverage case frames. To acquire case frames at this scale, we compile them automatically from a large corpus. The resulting case frames, compiled from the English Gigaword corpus, contain 9,300 verb entries. They include most examples of normal usage and are ready to be used in numerous NLP analyzers and applications.
