An investigation of imitation learning algorithms for structured prediction

In the imitation learning paradigm, algorithms learn from expert demonstrations in order to accomplish a particular task. Daumé III et al. [2009] framed structured prediction in this paradigm and developed the search-based structured prediction algorithm (Searn), which has been applied successfully to various natural language processing tasks with state-of-the-art performance. Recently, Ross et al. [2011] proposed the dataset aggregation algorithm (DAgger) and compared it with Searn on sequential prediction tasks. In this paper, we compare these two algorithms in the context of a more complex structured prediction task, namely biomedical event extraction. We demonstrate that DAgger has more stable performance and faster learning than Searn, and that these advantages are more pronounced in the parameter-free versions of the algorithms.
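To make the dataset aggregation idea concrete, the following is a minimal sketch on a toy sequence tagging problem, not the implementation evaluated in this paper. All names (`train`, `dagger`, `predict`), the table-lookup classifier, the feature pair (token, previous predicted tag), and the default tag `"O"` are assumptions introduced for illustration. The roll-in schedule, which follows the expert only in the first iteration and the learned policy afterwards, is likewise an assumption, chosen here because it needs no mixing parameter.

```python
import random
from collections import Counter, defaultdict


def train(dataset):
    """Fit a lookup-table classifier: the most frequent expert action per state."""
    counts = defaultdict(Counter)
    for features, action in dataset:
        counts[features][action] += 1
    return {f: c.most_common(1)[0][0] for f, c in counts.items()}


def dagger(sentences, n_iters=3, seed=0, default="O"):
    """Toy DAgger loop over (tokens, gold_tags) pairs.

    The expert action at each position is simply the gold tag (assumption).
    """
    rng = random.Random(seed)
    dataset, policy = [], {}
    for i in range(n_iters):
        # Assumed schedule: follow the expert in iteration 0, the learner afterwards.
        beta = 1.0 if i == 0 else 0.0
        for tokens, gold in sentences:
            prev = "<s>"
            for tok, expert_action in zip(tokens, gold):
                features = (tok, prev)
                # Label every visited state with the expert action and aggregate it.
                dataset.append((features, expert_action))
                if rng.random() < beta:
                    prev = expert_action  # roll in with the expert
                else:
                    prev = policy.get(features, default)  # roll in with the learner
        # Retrain on everything collected so far, not only the latest iteration.
        policy = train(dataset)
    return policy


def predict(policy, tokens, default="O"):
    """Greedy left-to-right decoding with the learned policy."""
    prev, out = "<s>", []
    for tok in tokens:
        prev = policy.get((tok, prev), default)
        out.append(prev)
    return out


if __name__ == "__main__":
    data = [(["a", "b", "a"], ["X", "Y", "X"])]
    print(predict(dagger(data), ["a", "b", "a"]))
```

The point of the sketch is the aggregation step: states reached under the learner's own predictions are labelled by the expert and added to one growing training set, so later classifiers see the states their own mistakes lead to.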

[1] Sampo Pyysalo, et al. BioNLP Shared Task 2011: Supporting Resources, 2011, BioNLP@ACL.

[2] Alan Fern, et al. Output Space Search for Structured Prediction, 2012, ICML.

[3] Matthew Richardson, et al. Markov Logic, 2008, Probabilistic Inductive Logic Programming.

[4] Koby Crammer, et al. Online Passive-Aggressive Algorithms, 2003, J. Mach. Learn. Res.

[5] Andreas Vlachos, et al. Search-based Structured Prediction applied to Biomedical Event Extraction, 2011, CoNLL.

[6] Stefan Schaal, et al. Is imitation learning the route to humanoid robots?, 1999, Trends in Cognitive Sciences.

[7] Pedro M. Domingos. MetaCost: a general method for making classifiers cost-sensitive, 1999, KDD '99.

[8] Pieter Abbeel, et al. Learning for control from multiple demonstrations, 2008, ICML '08.

[9] John A. Carroll, et al. Applied morphological processing of English, 2001, Natural Language Engineering.

[10] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[12] Robert E. Schapire, et al. Imitation Learning with a Value-Based Prior, 2007, UAI.

[13] Ben Taskar, et al. Structured Prediction Cascades, 2010, AISTATS.

[14] Noah A. Smith. Linguistic Structure Prediction, 2011, Synthesis Lectures on Human Language Technologies.

[15] Pedro M. Domingos, et al. Markov Logic: An Interface Layer for Artificial Intelligence, 2009.

[16] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.

[17] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.

[18] Akinori Yonezawa, et al. Overview of Genia Event Task in BioNLP Shared Task 2011, 2011, BioNLP@ACL.

[19] Pieter Abbeel, et al. Apprenticeship learning and reinforcement learning with application to robotic control, 2008.

[20] Eugene Charniak, et al. Any Domain Parsing: Automatic Domain Adaptation for Natural Language Parsing, 2010.

[21] Andreas Vlachos, et al. Biomedical event extraction from abstracts and full papers using search-based structured prediction, 2011, BMC Bioinformatics.

[22] Robert E. Schapire, et al. Reinforcement learning without rewards, 2010.

[23] John Langford, et al. Search-based structured prediction, 2009, Machine Learning.