Active Learning for Pipeline Models

For many machine learning solutions to complex applications, there are significant performance advantages to decomposing the overall task into several simpler sequential stages, commonly referred to as a pipeline model. Typically, such scenarios are also characterized by high sample complexity, motivating the study of active learning for these situations. While most active learning research examines single predictions, we extend such work to applications which utilize pipelined predictions. Specifically, we present an adaptive strategy for combining local active learning strategies into one that minimizes the annotation requirements for the overall task. Empirical results for a three-stage entity and relation extraction system demonstrate a significant reduction in supervised data requirements when using the proposed method.

[1]  Dan Roth,et al.  Margin-Based Active Learning for Structured Output Spaces , 2006, ECML.

[2]  D. Roth 1 Global Inference for Entity and Relation Identification via a Linear Programming Formulation , 2007 .

[3]  L. Getoor,et al.  1 Global Inference for Entity and Relation Identification via a Linear Programming Formulation , 2007 .

[4]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[5]  Dan Roth,et al.  A Linear Programming Formulation for Global Inference in Natural Language Tasks , 2004, CoNLL.

[6]  Andrew McCallum,et al.  Reducing Labeling Effort for Structured Prediction Tasks , 2005, AAAI.

[7]  Ming-Wei Chang,et al.  Multilingual dependency parsing: A pipeline approach , 2007 .

[8]  Rosie Jones,et al.  Learning to Extract Entities from Labeled and Unlabeled Text , 2005 .

[9]  Jian Su,et al.  Multi-Criteria-based Active Learning for Named Entity Recognition , 2004, ACL.

[10]  Raymond J. Mooney,et al.  Active Learning for Natural Language Parsing and Information Extraction , 1999, ICML.

[11]  Paul N. Bennett,et al.  Dual Strategy Active Learning , 2007, ECML.

[12]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[13]  David A. Cohn,et al.  Active Learning with Statistical Models , 1996, NIPS.

[14]  Beatrice Alex,et al.  Optimising Selective Sampling for Bootstrapping Named Entity Recognition , 2005, ICML 2005.

[15]  Andrew McCallum,et al.  Toward Optimal Active Learning through Sampling Estimation of Error Reduction , 2001, ICML.

[16]  Andrew Y. Ng,et al.  Solving the Problem of Cascading Errors: Approximate Bayesian Inference for Linguistic Annotation Pipelines , 2006, EMNLP.