Active Learning for Multiclass Cost-Sensitive Classification Using Probabilistic Models

Multiclass cost-sensitive active learning is a relatively new problem. In this paper, we derive two strategies for it: maximum expected cost and cost-weighted minimum margin. Both can be viewed as extensions of classical cost-insensitive active learning strategies. Experimental results show that the derived strategies are promising for cost-sensitive active learning. In particular, the cost-sensitive strategies outperform cost-insensitive ones on many benchmark datasets, which indicates that appropriate use of the cost information is important for solving cost-sensitive active learning problems.
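
The abstract names the two query strategies but does not state their formulas, so the following is only a minimal sketch of one plausible reading, not the paper's definitions. It assumes a probabilistic model that outputs class-probability estimates for each unlabeled instance, and a cost matrix where `cost[i][j]` is the cost of predicting class `j` when the true class is `i`. Under those assumptions, "maximum expected cost" is taken to mean querying the instance whose best (lowest expected cost) prediction is still the most costly, and "cost-weighted minimum margin" is taken to mean querying the instance whose two lowest expected costs are closest. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def expected_costs(probs: Sequence[float],
                   cost: Sequence[Sequence[float]]) -> List[float]:
    """Expected cost of predicting each class j: sum_i probs[i] * cost[i][j]."""
    n_classes = len(probs)
    return [sum(probs[i] * cost[i][j] for i in range(n_classes))
            for j in range(n_classes)]


def max_expected_cost_query(pool_probs: Sequence[Sequence[float]],
                            cost: Sequence[Sequence[float]]) -> int:
    """Assumed reading: pick the instance whose cheapest prediction
    has the highest expected cost."""
    scores = [min(expected_costs(p, cost)) for p in pool_probs]
    return max(range(len(scores)), key=scores.__getitem__)


def cost_weighted_min_margin_query(pool_probs: Sequence[Sequence[float]],
                                   cost: Sequence[Sequence[float]]) -> int:
    """Assumed reading: pick the instance whose two lowest expected
    costs are closest to each other."""
    def margin(p: Sequence[float]) -> float:
        lowest, second = sorted(expected_costs(p, cost))[:2]
        return second - lowest

    margins = [margin(p) for p in pool_probs]
    return min(range(len(margins)), key=margins.__getitem__)


# Toy data (hypothetical): three unlabeled instances, three classes.
# Mistaking class 2 for anything else is expensive.
cost = [[0, 1, 4],
        [1, 0, 4],
        [8, 8, 0]]
pool_probs = [[0.90, 0.05, 0.05],
              [0.40, 0.35, 0.25],
              [0.50, 0.50, 0.00]]

print(max_expected_cost_query(pool_probs, cost))         # prints 1
print(cost_weighted_min_margin_query(pool_probs, cost))  # prints 2
```

On this toy pool the two rules select different instances. Instance 1 has the costliest best prediction because it carries substantial probability on the expensive class, while instance 2 is an exact tie between two equally cheap predictions. Setting `cost` to the 0/1 matrix would reduce both rules to cost-insensitive uncertainty-style criteria, which matches the abstract's remark that the strategies extend the classical ones.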
