Automated extraction of Tree-Adjoining Grammars from treebanks

There has been a recent surge of interest in applying stochastic models to parsing. The use of tree-adjoining grammar (TAG) in this domain has been relatively limited, due in part to the unavailability, until recently, of large-scale corpora hand-annotated with TAG structures. Our goals are to develop inexpensive means of generating such corpora and to demonstrate their applicability to stochastic modeling. We present a method for automatically extracting a linguistically plausible TAG from the Penn Treebank. We also introduce low-labor methods for inducing higher-level organization of TAGs. Empirically, we evaluate various automatically extracted TAGs and demonstrate how our induced higher-level organization of TAGs can be used for smoothing stochastic TAG models.

Srinivas Bangalore | K. Vijay-Shanker | John Chen
