Backoff Parameter Estimation for the DOP Model

The Data Oriented Parsing (DOP) model currently achieves state-of-the-art parsing on benchmark corpora. However, existing DOP parameter estimation methods are known to be biased, and ad hoc adjustments are needed in order to reduce the effects of these biases on performance. In contrast with earlier work, in this paper we show that the DOP parameters constitute a hierarchically structured space of correlated events (rather than a set of disjoint events). The correlations between the different parameters can be expressed by an asymmetric relation called "backoff". Subsequently, we present a novel recursive estimation algorithm that exploits this hierarchical structure for parameter estimation through discounting and backoff. Finally, we report on experiments showing error reductions of up to 15% in comparison to earlier estimation methods.
