MaltOptimizer: A System for MaltParser Optimization

Freely available statistical parsers often require careful optimization to produce state-of-the-art results, which can be a non-trivial task, especially for application developers who are not interested in parsing research for its own sake. We present MaltOptimizer, a freely available tool developed to facilitate parser optimization using the open-source system MaltParser, a data-driven parser-generator that can be used to train dependency parsers given treebank data. MaltParser offers a wide range of parameters for optimization, including nine different parsing algorithms, two different machine learning libraries (each with a number of different learners), and an expressive specification language that can be used to define arbitrarily rich feature models. MaltOptimizer is an interactive system that first analyzes the training set in order to select a suitable starting point for optimization and then guides the user through the optimization of the parsing algorithm, the feature model, and the learning algorithm. Empirical evaluation on data from the CoNLL 2006 and 2007 shared tasks on dependency parsing shows that MaltOptimizer consistently improves over the baseline of default settings and sometimes even surpasses the result of manual optimization.
