Faster parsing and supertagging model estimation

Parsers are often the bottleneck in data acquisition because they process text too slowly to be applied widely. One way to improve parser efficiency is to construct more confident statistical models. More training data would support more sophisticated features and provide more evidence for existing ones, but gold-standard annotated data is limited and expensive to produce. We demonstrate faster methods for training a supertagger on hundreds of millions of automatically annotated words, producing statistical models that further constrain the number of derivations the parser must consider. By introducing new features and using an automatically annotated corpus, we double parsing speed on Wikipedia and the Wall Street Journal and slightly improve accuracy when parsing Section 00 of the Wall Street Journal.
