Experience with selecting exemplars from clean data

Abstract: In previous work, we developed a method for active selection of exemplars called ΔISB. Given a fixed set of exemplars, this method selects a concise subset for training, such that fitting the selected exemplars results in the entire set being fit as well as desired. Our implementation of ΔISB incorporates a method for regulating network complexity, automatically adding exemplars and hidden units as needed. In this paper, we compare ΔISB to three other exemplar selection techniques on three time-series prediction problems: the Mackey-Glass time series of dimensions 2.1 and 3.5, and the Rössler map. While both active selection methods (ΔISB and its 'maximum error' simplification) are less expensive than training on all of the examples, ΔISB performs best in terms of the compactness of the selected exemplar set and is more generally applicable. The 'maximum error' simplification performs nearly as well in most situations, although it is not as generally applicable.
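To make the selection loop concrete, here is a minimal sketch of the 'maximum error' simplification. It assumes a fixed candidate set (X, y), a small scikit-learn MLPRegressor retrained from scratch each round, and a user-chosen squared-error tolerance; the function name select_by_max_error, the network size, and these other choices are illustrative assumptions rather than the authors' implementation. The sketch also omits the full ΔISB criterion (an estimated change in integrated squared bias) and the automatic addition of hidden units described in the abstract.

# Hypothetical sketch of greedy "maximum error" exemplar selection.
# Assumptions (not from the paper): scikit-learn MLPRegressor as the learner,
# squared error as the fit criterion, retraining from scratch each round,
# and a fixed number of hidden units (full ΔISB grows the network as needed).
import numpy as np
from sklearn.neural_network import MLPRegressor

def select_by_max_error(X, y, tol=1e-2, seed=0):
    """Greedily add the worst-fit candidate until every exemplar in the
    full set is fit to within `tol` squared error."""
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(X)))]          # seed with one random exemplar
    while True:
        net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000,
                           random_state=seed)
        net.fit(X[selected], y[selected])           # fit only the selected subset
        errors = (net.predict(X) - y) ** 2          # error over the entire candidate set
        worst = int(np.argmax(errors))
        if errors[worst] <= tol or worst in selected:
            return selected, net                    # whole set fit "as well as desired"
        selected.append(worst)                      # add the maximum-error exemplar

# Example use on a toy 1-D regression task:
# X = np.linspace(0, 1, 200).reshape(-1, 1); y = np.sin(2 * np.pi * X).ravel()
# subset, net = select_by_max_error(X, y, tol=0.01)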

[1]  P. Sollich, Query construction, entropy, and generalization in neural-network models, 1994, Physical Review E.

[2]  Garrison W. Cottrell, et al. Learning Mackey-Glass from 25 Examples, Plus or Minus 2, 1993, NIPS.

[3]  George E. P. Box, et al. Empirical Model-Building and Response Surfaces, 1988.

[4]  David Haussler, et al. What Size Net Gives Valid Generalization?, 1989, Neural Computation.

[5]  Daniel F. McCaffrey, et al. Convergence rates for single hidden layer feedforward networks, 1994, Neural Networks.

[6]  Mark Plutowski, Selecting training exemplars for neural network learning, 1994.

[7]  W. J. Studden, et al. Theory of Optimal Experiments, 1972.

[8]  H. Abarbanel, et al. Determining embedding dimension for phase-space reconstruction using a geometrical construction, 1992, Physical Review A.

[9]  M. F. Møller, et al. Efficient Training of Feed-Forward Neural Networks, 1993.

[10]  H. Müller, Optimal designs for nonparametric kernel regression, 1984.

[11]  Mark Plutowski, et al. Selecting concise training sets from clean data, 1993, IEEE Trans. Neural Networks.

[12]  A. Lapedes, et al. Nonlinear signal processing using neural networks: Prediction and system modelling, 1987.

[13]  Garrison W. Cottrell, et al. Non-Linear Dimensionality Reduction, 1992, NIPS.

[14]  André I. Khuri, et al. Response surface methodology: 1966–1988, 1989.

[15]  Paul W. Munro, et al. Repeat Until Bored: A Pattern Selection Strategy, 1991, NIPS.

[16]  Garrison W. Cottrell, et al. Learning Simple Arithmetic Procedures, Connection Science.

[17]  M. Deaton, et al. Response Surfaces: Designs and Analyses, 1989.

[18]  L. Glass, et al. Oscillation and chaos in physiological control systems, 1977, Science.

[19]  H. White, et al. Cross-Validation Estimates IMSE, 1993, NIPS.

[20]  A. Röbel, The Dynamic Pattern Selection Algorithm: Effective Training and Controlled Generalization of Backpropagation Neural Networks, 1994.

[21]  M. Møller, et al. Supervised learning on large redundant training sets, 1992, Neural Networks for Signal Processing II, Proceedings of the 1992 IEEE Workshop.

[22]  L. Tsimring, et al. The analysis of observed chaotic data in physical systems, 1993.