Annotation cost-sensitive active learning by tree sampling

Active learning is an important machine learning setup for reducing the labelling effort required of humans. Most existing works rely on the simplifying assumption that every labelling query carries the same annotation cost, but this assumption is often unrealistic: annotation costs can vary across data instances and may be unknown until the query is actually made. Traditional active learning algorithms cannot handle such a scenario. In this work, we study annotation cost-sensitive active learning, in which the utility and the cost of each query must be estimated simultaneously. We propose a novel algorithm, cost-sensitive tree sampling, that carries out the two estimation tasks jointly within a tree-structured model motivated by hierarchical sampling, a well-known algorithm for traditional active learning. Extensive experimental results on datasets with both simulated and true annotation costs show that the proposed method is generally superior to other annotation cost-sensitive algorithms.
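
The following is a minimal sketch of the general idea described above, not the authors' actual algorithm. It assumes SciPy's hierarchical clustering supplies the tree, uses label impurity among already-queried points as a stand-in for query utility, and uses running means of revealed costs as the per-cluster cost estimate; the class name, the binary-label assumption, and the utility-per-cost selection rule are illustrative choices, not details from the paper.

```python
# Illustrative sketch only: joint utility/cost estimation over a clustering tree,
# in the spirit of hierarchical sampling. All design choices here are assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


class TreeCostSensitiveSampler:
    def __init__(self, X, n_clusters=8, rng=None):
        self.rng = np.random.default_rng(rng)
        # Build a hierarchical-clustering tree over the unlabelled pool and cut it
        # into a fixed number of clusters (a simplification of tree-based sampling).
        Z = linkage(X, method="ward")
        self.assign = fcluster(Z, t=n_clusters, criterion="maxclust")
        self.clusters = np.unique(self.assign)
        self.labels = {}                                # queried index -> observed label (0/1 assumed)
        self.costs = {c: [] for c in self.clusters}     # observed annotation costs per cluster

    def _utility(self, c):
        """Estimated utility of cluster c: label impurity among its queried points
        (optimistically 1.0 if the cluster is still unexplored)."""
        lab = [self.labels[i] for i in self.labels if self.assign[i] == c]
        if not lab:
            return 1.0
        p = float(np.mean(lab))
        return 1.0 - max(p, 1.0 - p)

    def _cost(self, c):
        """Estimated annotation cost of cluster c: running mean of revealed costs
        (default 1.0 before any cost has been observed)."""
        return float(np.mean(self.costs[c])) if self.costs[c] else 1.0

    def select(self):
        """Pick an unlabelled point from the cluster with the best utility-per-cost ratio."""
        best = max(self.clusters, key=lambda c: self._utility(c) / self._cost(c))
        pool = [i for i in range(len(self.assign))
                if self.assign[i] == best and i not in self.labels]
        if not pool:  # chosen cluster exhausted: fall back to any unlabelled point
            pool = [i for i in range(len(self.assign)) if i not in self.labels]
        return int(self.rng.choice(pool))

    def update(self, i, label, cost):
        """Record the label and the annotation cost revealed by querying point i."""
        self.labels[i] = label
        self.costs[self.assign[i]].append(cost)
```

In this toy version, a query is answered with both a label and a cost, and the sampler gradually shifts its queries toward clusters that remain impure (high estimated utility) while appearing cheap to annotate (low estimated cost).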
