Selecting Optimal Experiments for Multiple Output Multilayer Perceptrons

Where should a researcher conduct experiments to provide training data for a multilayer perceptron? This question is investigated, and a statistical method for selecting optimal experimental design points for multiple-output multilayer perceptrons is introduced. Multiple-class discrimination problems are examined within a framework in which the multilayer perceptron is viewed as a multivariate nonlinear regression model. Following a Bayesian formulation for the case where the variance-covariance matrix of the responses is unknown, a selection criterion is developed based on the volume of the joint confidence ellipsoid for the weights of the multilayer perceptron. An example demonstrates the superiority of optimally selected design points over both randomly chosen points and points chosen in a grid pattern. A simplification of the basic criterion is also offered, using Hadamard matrices to produce uncorrelated outputs.
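
The sketch below illustrates, in broad strokes, the kind of selection rule the abstract describes: a D-optimality-style criterion that prefers design points whose stacked output/weight Jacobian J makes det(JᵀJ) large, i.e. shrinks the joint confidence ellipsoid for the weights. The tiny one-hidden-layer network, the finite-difference Jacobian, the ridge term, and the greedy candidate search are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed, not the paper's procedure) of selecting design
# points by maximizing log det(J^T J), where J stacks the Jacobians of the
# network outputs with respect to the weights over the chosen points.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-hidden-layer MLP: 2 inputs, 3 hidden units, 2 outputs.
n_in, n_hid, n_out = 2, 3, 2
W1 = rng.normal(size=(n_hid, n_in)); b1 = rng.normal(size=n_hid)
W2 = rng.normal(size=(n_out, n_hid)); b2 = rng.normal(size=n_out)

def pack():
    """Flatten all weights and biases into one parameter vector."""
    return np.concatenate([W1.ravel(), b1, W2.ravel(), b2])

def forward(x, theta):
    """Evaluate the MLP at input x for a flat weight vector theta."""
    i = 0
    w1 = theta[i:i + n_hid * n_in].reshape(n_hid, n_in); i += n_hid * n_in
    c1 = theta[i:i + n_hid]; i += n_hid
    w2 = theta[i:i + n_out * n_hid].reshape(n_out, n_hid); i += n_out * n_hid
    c2 = theta[i:i + n_out]
    h = np.tanh(w1 @ x + c1)
    return np.tanh(w2 @ h + c2)

def jacobian(x, theta, eps=1e-6):
    """Finite-difference Jacobian of the outputs w.r.t. the weights (n_out x n_params)."""
    base = forward(x, theta)
    J = np.zeros((n_out, theta.size))
    for k in range(theta.size):
        t = theta.copy(); t[k] += eps
        J[:, k] = (forward(x, t) - base) / eps
    return J

def log_det_criterion(points, theta):
    """log det(J^T J + ridge*I) for the Jacobian stacked over the design points."""
    J = np.vstack([jacobian(x, theta) for x in points])
    M = J.T @ J + 1e-8 * np.eye(theta.size)   # small ridge keeps M nonsingular
    return np.linalg.slogdet(M)[1]

# Greedy search: from a candidate pool, repeatedly add the point that most
# increases the criterion until the design reaches the desired size.
theta = pack()
pool = [rng.uniform(-1, 1, size=n_in) for _ in range(50)]
design = []
for _ in range(8):
    best = max(pool, key=lambda x: log_det_criterion(design + [x], theta))
    design.append(best)
    pool = [x for x in pool if x is not best]

print("selected design points:\n", np.array(design))
```

In this toy setting the criterion is evaluated at the current weight estimates, mirroring sequential nonlinear design; the abstract's Hadamard-matrix simplification corresponds to choosing orthogonal target codes so that the response covariance structure drops out of the criterion.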
