Active learning in neural networks

We discuss active learning, a new paradigm for supervised learning that aims to improve the efficiency of neural network training. Its starting point is the observation that the traditional approach of randomly selecting training samples leads to large, highly redundant training sets. This redundancy is not always desirable: when acquiring training data is expensive, small, informative training sets are preferable. Such training sets can be obtained if the learner is allowed to select the training data that it expects to be most informative. The learner is then no longer a passive recipient of information but takes an active role in selecting its training data.
