Dynamic Feature Selection for Dependency Parsing

Feature computation and exhaustive search have significantly restricted the speed of graph-based dependency parsing. We propose a faster framework of dynamic feature selection, where features are added sequentially as needed, edges are pruned early, and decisions are made online for each sentence. We model this as a sequential decision-making problem and solve it by imitation learning techniques. We test our method on 7 languages. Our dynamic parser can achieve accuracies comparable or even superior to parsers using a full set of features, while computing fewer than 30% of the feature templates.

[1]  Alexander M. Rush,et al.  Vine Pruning for Efficient Multi-Pass Dependency Parsing , 2012, NAACL.

[2]  Jason Eisner,et al.  Three New Probabilistic Models for Dependency Parsing: An Exploration , 1996, COLING.

[3]  Dan Klein,et al.  Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network , 2003, NAACL.

[4]  John Langford,et al.  Search-based structured prediction , 2009, Machine Learning.

[5]  David A. Smith,et al.  Dependency Parsing by Belief Propagation , 2008, EMNLP.

[6]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[7]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[8]  Robert E. Tarjan,et al.  Finding optimum branchings , 1977, Networks.

[9]  J. Andrew Bagnell,et al.  SpeedBoost: Anytime Prediction with Uniform Near-Optimality , 2012, AISTATS.

[10]  Eric P. Xing,et al.  Stacking Dependency Parsers , 2008, EMNLP.

[11]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[12]  Koby Crammer,et al.  Online Passive-Aggressive Algorithms , 2003, J. Mach. Learn. Res..

[13]  Dale Schuurmans,et al.  Simple Training of Dependency Parsers via Structured Boosting , 2007, IJCAI.

[14]  David M. Bradley,et al.  Boosting Structured Prediction for Imitation Learning , 2006, NIPS.

[15]  Pieter Abbeel,et al.  Apprenticeship learning via inverse reinforcement learning , 2004, ICML.

[16]  Koby Crammer,et al.  Ultraconservative Online Algorithms for Multiclass Problems , 2001, J. Mach. Learn. Res..

[17]  Brian Roark,et al.  Classifying Chart Cells for Quadratic Complexity Context-Free Inference , 2008, COLING.

[18]  Ben Taskar,et al.  Structured Prediction Cascades , 2010, AISTATS.

[19]  Sabine Buchholz,et al.  CoNLL-X Shared Task on Multilingual Dependency Parsing , 2006, CoNLL.

[20]  Daphne Koller,et al.  Active Classification based on Value of Classifier , 2011, NIPS.

[21]  Koby Crammer,et al.  Online Large-Margin Training of Dependency Parsers , 2005, ACL.

[22]  David Ellis,et al.  Multilevel Coarse-to-Fine PCFG Parsing , 2006, NAACL.

[23]  Xavier Carreras,et al.  Experiments with a Higher-Order Projective Dependency Parser , 2007, EMNLP.

[24]  Fernando Pereira,et al.  Non-Projective Dependency Parsing using Spanning Tree Algorithms , 2005, HLT.

[25]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[26]  Geoffrey J. Gordon,et al.  A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.

[27]  Colin Cherry,et al.  Fast and Accurate Arc Filtering for Dependency Parsing , 2010, COLING.

[28]  Yuji Matsumoto,et al.  Statistical Dependency Analysis with Support Vector Machines , 2003, IWPT.

[29]  Jason Eisner,et al.  Cost-sensitive Dynamic Feature Selection , 2012 .

[30]  Fernando Pereira,et al.  Online Learning of Approximate Dependency Parsing Algorithms , 2006, EACL.

[31]  Balázs Kégl,et al.  Fast classification using sparse decision DAGs , 2012, ICML.

[32]  Michael Collins,et al.  Efficient Third-Order Dependency Parsers , 2010, ACL.