Selective Sampling for Nearest Neighbor Classifiers

Most existing inductive learning algorithms work under the assumption that their training examples are already tagged. There are domains, however, where the tagging procedure requires significant computational resources or manual labor. In such cases, it may be beneficial for the learner to be active, intelligently selecting the examples for labeling with the goal of reducing the labeling cost. In this paper we present LSS, a lookahead algorithm for selective sampling of examples for nearest neighbor classifiers. The algorithm searches for the example with the highest expected utility, taking into account its effect on the resulting classifier. Computing the expected utility of an example requires estimating the probabilities of its possible labels; we propose using a random field model for this estimation. The LSS algorithm was evaluated empirically on seven real and artificial data sets, and its performance was compared to that of other selective sampling algorithms. The experiments show that the proposed algorithm outperforms the other methods in terms of both average error rate and stability.
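To make the lookahead idea concrete, below is a minimal sketch of expected-utility selective sampling for a 1-nearest-neighbor classifier. It is not the paper's exact formulation: the label-probability estimate here is a simple distance-weighted vote standing in for the random field model, the utility measure is an illustrative self-consistency score over the unlabeled pool, and all function names (`knn_predict`, `label_probability`, `expected_utility`, `select_next_example`) are hypothetical.

```python
# Sketch of lookahead selective sampling for a 1-NN classifier (binary labels 0/1).
# The label-probability estimate and utility score are illustrative assumptions,
# not the random field model or exact utility used in the paper.
import numpy as np

def knn_predict(X_train, y_train, X):
    """Predict labels for rows of X with a 1-nearest-neighbor rule."""
    preds = []
    for x in X:
        dists = np.linalg.norm(X_train - x, axis=1)
        preds.append(y_train[np.argmin(dists)])
    return np.array(preds)

def label_probability(x, X_labeled, y_labeled):
    """Crude stand-in for the random field estimate: distance-weighted vote for label 1."""
    dists = np.linalg.norm(X_labeled - x, axis=1) + 1e-9
    w = 1.0 / dists
    return float(np.sum(w[y_labeled == 1]) / np.sum(w))

def expected_utility(x, X_labeled, y_labeled, X_pool):
    """Lookahead: average over the possible labels of x, weighted by their
    estimated probabilities, of how confidently the resulting 1-NN classifier
    labels the remaining pool."""
    p1 = label_probability(x, X_labeled, y_labeled)
    utility = 0.0
    for label, p in ((1, p1), (0, 1.0 - p1)):
        X_new = np.vstack([X_labeled, x])
        y_new = np.append(y_labeled, label)
        pool_preds = knn_predict(X_new, y_new, X_pool)
        probs = np.array([label_probability(z, X_new, y_new) for z in X_pool])
        agreement = np.mean(np.where(pool_preds == 1, probs, 1.0 - probs))
        utility += p * agreement
    return utility

def select_next_example(X_labeled, y_labeled, X_pool):
    """Return the index of the pool example with the highest expected utility."""
    scores = [expected_utility(x, X_labeled, y_labeled, X_pool) for x in X_pool]
    return int(np.argmax(scores))
```

The key design choice this sketch illustrates is that the candidate example is scored by its effect on the classifier that would result from labeling it, averaged over its possible labels, rather than by a local uncertainty measure alone.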
