Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence

Active Learning from Relative Queries

Active learning has been extensively studied and shown to be useful in solving real problems. The typical setting of traditional active learning methods is to query labels from an oracle. This is only possible if an expert exists, which may not be the case in many real-world applications. In this paper, we focus on designing easier questions that can be answered by a non-expert. These questions solicit relative information, as opposed to absolute information, and can even be generated from side information. We propose an active learning approach that queries the ordering of the importance of an instance's neighbors rather than its label. We explore our approach on real datasets and make several interesting discoveries, including that querying neighborhood information can be an effective question to ask and can sometimes even yield better performance than querying labels.
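To make the idea of a relative query concrete, the following is a minimal sketch, not the paper's actual algorithm: instead of asking an annotator "what is the label of x?", the learner asks "order x's neighbors from most to least similar to x", and then exploits that ordering (here, by adopting the label of the highest-ranked neighbor that is already labeled). The `oracle_metric` simulating the annotator's judgment, and all function names, are hypothetical.

```python
import math

def rank_neighbors(oracle_metric, x, neighbors):
    # The "relative query": the annotator orders x's neighbors from
    # most to least similar to x -- no label for x is requested.
    return sorted(range(len(neighbors)),
                  key=lambda i: oracle_metric(x, neighbors[i]))

def predict_from_ranking(ranking, neighbor_labels):
    # One simple way to use the answer: adopt the label of the
    # highest-ranked neighbor whose label is already known.
    for i in ranking:
        if neighbor_labels[i] is not None:
            return neighbor_labels[i]
    return None

# Toy data: 1-D points; only some neighbors carry labels.
neighbors = [(0.0,), (1.0,), (5.0,)]
neighbor_labels = ["A", None, "B"]
x = (0.8,)

# Simulated annotator judgment: distance in the feature space.
oracle = lambda a, b: math.dist(a, b)
ranking = rank_neighbors(oracle, x, neighbors)   # [1, 0, 2]
print(predict_from_ranking(ranking, neighbor_labels))  # prints "A"
```

The point of the sketch is that the annotator only supplies an ordering, which a non-expert can often produce even when the true label requires expertise.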
