Efficiency in Unification-Based N-Best Parsing

We extend a recently proposed algorithm for n-best unpacking of parse forests to deal efficiently with (a) Maximum Entropy (ME) parse selection models containing important classes of non-local features, and (b) forests produced by unification grammars containing significant proportions of globally inconsistent analyses. Empirically, the new algorithm exhibits a linear relationship between processing time and the number of analyses unpacked, at all degrees of ME feature non-locality. In addition, compared with agenda-driven best-first parsing and with exhaustive parsing followed by post-hoc parse selection, it leads to improved parsing speed, coverage, and accuracy.
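
To make the idea of n-best unpacking concrete, the following is a minimal sketch of lazy n-best extraction from a packed forest. It is not the algorithm described above: it assumes strictly local, additive (log-linear) edge scores and an acyclic forest, and it handles neither non-local ME features nor analyses that turn out to be globally inconsistent under unification. The class name `Forest`, the method names, and the toy data are hypothetical and chosen only for illustration.

```python
import heapq

class Forest:
    """Packed forest: node -> list of alternative edges (local_score, children).

    Assumptions for this sketch: the forest is acyclic and scores are additive.
    """

    def __init__(self, edges):
        self.edges = edges
        self.best = {}    # node -> hypotheses found so far, best first
        self.agenda = {}  # node -> heap of (-score, edge_index, child_ranks)
        self.seen = {}    # node -> set of (edge_index, child_ranks) already queued

    def _push(self, node, ei, ranks):
        """Queue the hypothesis using edge `ei` with the given child ranks."""
        if (ei, ranks) in self.seen[node]:
            return
        local, children = self.edges[node][ei]
        total = local
        for child, rank in zip(children, ranks):
            hyp = self.kth(child, rank)
            if hyp is None:  # the child has fewer than rank+1 analyses
                return
            total += hyp[0]
        self.seen[node].add((ei, ranks))
        heapq.heappush(self.agenda[node], (-total, ei, ranks))

    def kth(self, node, k):
        """Return (score, tree) for the k-th best analysis (0-based), or None."""
        if node not in self.best:
            self.best[node], self.agenda[node], self.seen[node] = [], [], set()
            # Seed the agenda with the best hypothesis for every edge.
            for ei, (_, children) in enumerate(self.edges[node]):
                self._push(node, ei, (0,) * len(children))
        while len(self.best[node]) <= k and self.agenda[node]:
            neg, ei, ranks = heapq.heappop(self.agenda[node])
            children = self.edges[node][ei][1]
            subtrees = tuple(self.kth(c, r)[1] for c, r in zip(children, ranks))
            self.best[node].append((-neg, (node, ei, subtrees)))
            # Expand neighbours: advance one child rank at a time.
            for i in range(len(ranks)):
                self._push(node, ei, ranks[:i] + (ranks[i] + 1,) + ranks[i + 1:])
        return self.best[node][k] if k < len(self.best[node]) else None

    def n_best(self, root, n):
        """Unpack up to n analyses of `root`, best first."""
        results = []
        for k in range(n):
            hyp = self.kth(root, k)
            if hyp is None:
                break
            results.append(hyp)
        return results


# Toy forest: S has one edge over A and B, each of which has two analyses.
toy = Forest({
    "S": [(0.0, ["A", "B"])],
    "A": [(-1.0, []), (-2.0, [])],
    "B": [(-0.5, []), (-3.0, [])],
})
top3_scores = [score for score, _ in toy.n_best("S", 3)]
```

Because each node only expands as many hypotheses as its parents request, the work grows with the number of analyses actually unpacked and not with the total size of the unpacked forest. Extending such a scheme to non-local features and to filtering out inconsistent analyses is the contribution claimed in the abstract and is not shown here.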
