Inactive learning?: difficulties employing active learning in practice

Despite the widespread adoption of machine learning techniques in real-world settings, and the large volume of research on active learning, active learning techniques have been slow to gain substantial traction in practical applications. This slow uptake runs contrary to active learning's promise of reduced model-development costs and improved performance on a fixed model-development budget. This essay presents several important and under-discussed challenges to using active learning well in practice. We hope this paper can serve as a call to arms for researchers in active learning: an encouragement to focus even more attention on how practitioners might actually use active learning.
