Decisive Supervised Learning

Traditional active learning selects the most informative (e.g., the most uncertain) example and queries an oracle for its label. However, as the learner sees more examples, even the most uncertain remaining examples can become fairly certain. In that case, would it be better to make a prediction directly and accept the consequences if it is wrong, rather than ask the oracle for a label? In this paper, we propose a new learning paradigm in which, in contrast to traditional active learning, the learner can obtain true labels not only by querying oracles but also by making predictions and accepting the consequences. Under this paradigm, we further propose a novel algorithm, named Decisive Learner, which always chooses the most decisive action (either querying an oracle or making a prediction) during learning. We show empirically that, compared to other typical learners (indecisive learners, traditional active learners, and conservative learners), our decisive learner makes fewer mistakes and incurs the lowest total cost in the learning process.
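The trade-off between querying and predicting can be illustrated with a minimal sketch. It is not the Decisive Learner algorithm itself, whose selection rule is not specified here; it assumes a simple expected-cost rule in which the learner predicts whenever the expected cost of a mistake is below the cost of a query. The names `choose_action`, `simulate`, `query_cost`, and `mistake_cost` are hypothetical and introduced only for illustration.

```python
from typing import Iterable, List, Tuple


def choose_action(confidence: float, query_cost: float, mistake_cost: float) -> str:
    """Pick 'predict' or 'query' with an assumed expected-cost rule.

    confidence is the learner's estimated probability that its
    prediction is correct (an assumption of this sketch).
    """
    expected_mistake_cost = (1.0 - confidence) * mistake_cost
    return "predict" if expected_mistake_cost < query_cost else "query"


def simulate(
    examples: Iterable[Tuple[float, int, int]],
    query_cost: float,
    mistake_cost: float,
) -> Tuple[float, int, List[str]]:
    """Accumulate total cost over (confidence, predicted, truth) triples."""
    total_cost = 0.0
    mistakes = 0
    actions: List[str] = []
    for confidence, predicted, truth in examples:
        action = choose_action(confidence, query_cost, mistake_cost)
        actions.append(action)
        if action == "query":
            # Asking the oracle always costs the query fee.
            total_cost += query_cost
        elif predicted != truth:
            # A wrong prediction incurs the consequence.
            total_cost += mistake_cost
            mistakes += 1
        # Either way the true label is now known; a real learner
        # would add it to the training set and retrain here.
    return total_cost, mistakes, actions


if __name__ == "__main__":
    stream = [(0.55, 1, 0), (0.95, 1, 1), (0.90, 0, 1)]
    print(simulate(stream, query_cost=1.0, mistake_cost=5.0))
```

With a low-confidence example the rule falls back to querying, while confident examples are predicted directly, so the total cost mixes query fees with the occasional mistake penalty.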
