Discovering General-Purpose Active Learning Strategies

We propose a general-purpose approach to discovering active learning (AL) strategies from data. These strategies are transferable from one domain to another and can be used in conjunction with many machine learning models. To this end, we formalize the annotation process as a Markov decision process, design universal state and action spaces, and introduce a new reward function that precisely models the AL objective of minimizing the annotation cost. We seek to find an optimal (non-myopic) AL strategy using reinforcement learning. We evaluate the learned strategies on multiple unrelated domains and show that they consistently outperform state-of-the-art baselines.
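To make the Markov decision process view concrete, the following is a minimal sketch of one annotation episode, not the paper's implementation. Everything in it is an assumption made for illustration: the 1-D toy data, the 1-nearest-neighbour classifier, the names `AnnotationEpisode`, `run_episode`, `random_policy` and `farthest_policy`, and in particular the reward of -1 per annotation with termination at a target accuracy, which is one simple way to encode "minimize the annotation cost". The two hand-written policies stand in for a strategy that would be learned with reinforcement learning.

```python
import random

def nearest_label(x, labeled):
    """Predict with 1-nearest-neighbour over the labeled (x, y) pairs."""
    return min(labeled, key=lambda p: abs(p[0] - x))[1]

def accuracy(labeled, test):
    """Fraction of held-out points the current classifier gets right."""
    return sum(nearest_label(x, labeled) == y for x, y in test) / len(test)

class AnnotationEpisode:
    """Toy pool-based AL episode viewed as an MDP (illustrative only)."""

    def __init__(self, pool, test, seed_id, target_acc):
        self.unlabeled = dict(enumerate(pool))        # id -> (x, y); y is hidden until queried
        self.labeled = [self.unlabeled.pop(seed_id)]  # start from one labeled point
        self.test = test
        self.target_acc = target_acc

    def done(self):
        # Terminal when the target quality is reached or the pool is exhausted.
        reached = accuracy(self.labeled, self.test) >= self.target_acc
        return reached or not self.unlabeled

    def actions(self):
        """Candidate actions: (id, x, distance to the nearest labeled point)."""
        return [(i, x, min(abs(x - lx) for lx, _ in self.labeled))
                for i, (x, _) in self.unlabeled.items()]

    def step(self, action_id):
        """Annotate one datapoint; assumed reward is -1 per annotation."""
        self.labeled.append(self.unlabeled.pop(action_id))
        return -1, self.done()

def run_episode(env, policy):
    """Roll out a policy; the return equals minus the number of annotations."""
    total, finished = 0, env.done()
    while not finished:
        reward, finished = env.step(policy(env.actions()))
        total += reward
    return total

def random_policy(actions, rng=random.Random(0)):
    # Baseline: query a uniformly random datapoint (shared seeded RNG).
    return rng.choice(actions)[0]

def farthest_policy(actions):
    # Heuristic stand-in for a learned strategy: query the least-covered point.
    return max(actions, key=lambda a: a[2])[0]

def make_env():
    # Hypothetical toy task: label is 1 when x >= 0.5.
    pool = [(i / 20, int(i >= 10)) for i in range(20)]
    test = [(i / 20 + 0.01, int(i >= 10)) for i in range(20)]
    return AnnotationEpisode(pool, test, seed_id=0, target_acc=0.9)

if __name__ == "__main__":
    for name, policy in [("random", random_policy), ("farthest", farthest_policy)]:
        print(name, run_episode(make_env(), policy))
```

Under this assumed reward, maximizing the episode return is the same as reaching the target accuracy with as few annotations as possible, so a non-myopic policy is rewarded for queries that pay off only later in the episode.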
