Paper information - Immediate-Head Parsing for Language Models

Immediate-Head Parsing for Language Models

We present two language models based upon an "immediate-head" parser, our name for a parser that conditions all events below a constituent c upon the head of c. While all of the most accurate statistical parsers are of the immediate-head variety, no previous grammatical language model uses this technology. The perplexities of both models improve significantly upon the trigram baseline as well as upon the best previous grammar-based language model. For the better of our two models, these improvements are 24% and 14%, respectively. We also suggest that improving the underlying parser should significantly improve the model's perplexity, and that even in the near term there is considerable potential for improvement in immediate-head language models.

Eugene Charniak

[1] David Goddeau, et al. Using probabilistic shift-reduce parsing in speech recognition systems, 1992, ICSLP.

[2] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[3] Andreas Stolcke, et al. Precise N-Gram Probabilities From Stochastic Context-Free Grammars, 1994, ACL.

[4] Andreas Stolcke, et al. An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities, 1994, CL.

[5] David M. Magerman. Statistical Decision-Tree Models for Parsing, 1995, ACL.

[6] Mark Lauer, et al. Corpus Statistics Meet the Noun Compound: Some Empirical Results, 1995, ACL.

[7] Eugene Charniak, et al. Tree-Bank Grammars, 1996, AAAI/IAAI, Vol. 2.

[8] Michael Collins, et al. Three Generative, Lexicalised Models for Statistical Parsing, 1997, ACL.

[9] Frederick Jelinek, et al. Exploiting Syntactic Structure for Language Modeling, 1998, ACL.

[10] Zhiyi Chi, et al. Estimation of Probabilistic Context-Free Grammars, 1998, CL.

[11] Eugene Charniak, et al. A Maximum-Entropy-Inspired Parser, 2000, ANLP.

[12] Joshua Goodman, et al. Putting it all together: language model combination, 2000, ICASSP.

[13] Rens Bod. What is the Minimal Set of Fragments that Achieves Maximal Parse Accuracy?, 2001, ACL.

[14] Brian Roark, et al. Probabilistic Top-Down Parsing and Language Modeling, 2001, CL.

[15] Michael Collins, et al. Head-Driven Statistical Models for Natural Language Parsing, 2003, CL.