Paper information - Alternative Approaches for Generating Bodies of Grammar Rules

Alternative Approaches for Generating Bodies of Grammar Rules

We compare two approaches for describing and generating bodies of rules used for natural language parsing. In today's parsers, rule bodies do not exist a priori but are generated on the fly, usually with methods based on n-grams, which are one particular way of inducing probabilistic regular languages. We compare two approaches for inducing such languages: one is based on n-grams, the other on minimization of the Kullback-Leibler divergence. The inferred regular languages are used for generating bodies of rules inside a parsing procedure. We compare the two approaches along two dimensions: the quality of the probabilistic regular language they produce, and the performance of the parser built with them. The second approach outperforms the first along both dimensions.
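To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' procedure. It assumes that a rule body is a tuple of category symbols, fits an unsmoothed bigram model to a toy sample, and measures the Kullback-Leibler divergence from the sample's empirical distribution to the model. The names `train_bigram`, `body_prob`, `kl_divergence`, and the toy sample are hypothetical and do not come from the paper. The paper's second approach induces the language by minimizing this kind of divergence, which the sketch does not attempt.

```python
import math
from collections import Counter, defaultdict

# Hypothetical boundary markers for the start and end of a rule body.
START, STOP = "<s>", "</s>"


def train_bigram(bodies):
    """Fit a maximum-likelihood bigram model (no smoothing) to rule bodies."""
    counts = defaultdict(Counter)
    for body in bodies:
        symbols = [START, *body, STOP]
        for prev, cur in zip(symbols, symbols[1:]):
            counts[prev][cur] += 1
    return {
        prev: {cur: n / sum(nexts.values()) for cur, n in nexts.items()}
        for prev, nexts in counts.items()
    }


def body_prob(model, body):
    """Probability the bigram model assigns to one complete rule body."""
    symbols = [START, *body, STOP]
    prob = 1.0
    for prev, cur in zip(symbols, symbols[1:]):
        prob *= model.get(prev, {}).get(cur, 0.0)
    return prob


def kl_divergence(p, q):
    """D(p || q) in bits, where p is a dict and q is a function on strings."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue
        qx = q(x)
        if qx == 0.0:
            return math.inf  # q gives no mass to a string that p generates
        total += px * math.log2(px / qx)
    return total


# Toy sample of rule bodies (an assumption made for illustration only).
sample = [
    ("DT", "NN"),
    ("DT", "JJ", "NN"),
    ("DT", "JJ", "JJ", "NN"),
    ("NN",),
]
model = train_bigram(sample)
empirical = {body: n / len(sample) for body, n in Counter(sample).items()}
divergence = kl_divergence(empirical, lambda body: body_prob(model, body))
print(f"D(empirical || bigram) = {divergence:.4f} bits")
```

The bigram model generalizes beyond the sample: it assigns nonzero probability to unseen bodies such as `("DT", "JJ", "JJ", "JJ", "NN")`. That is why the divergence from the empirical distribution is positive here.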

Maarten de Rijke | Gabriel G. Infante López
