Bayesian selection of important features for feedforward neural networks

Abstract This paper presents a probability-of-error-based method for determining the saliency (usefulness) of input features and hidden nodes. We show that the partial derivative of the output nodes with respect to a given input feature yields a sensitivity measure for the probability of error. This partial derivative therefore serves as a saliency metric: it measures how sensitive a feedforward network trained with a mean squared error learning procedure is to a given input feature.
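The derivative-based idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact saliency formula, so everything concrete here is an assumption made for illustration: a single hidden layer with sigmoid units, random untrained weights, the helper names `forward`, `output_input_jacobian`, and `saliency`, and the choice to aggregate by averaging the absolute derivatives over samples and output nodes.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, W1, b1, W2, b2):
    """One-hidden-layer sigmoid network (an assumed architecture)."""
    h = sigmoid(x @ W1 + b1)   # hidden activations, shape (n_hidden,)
    y = sigmoid(h @ W2 + b2)   # output activations, shape (n_outputs,)
    return h, y


def output_input_jacobian(x, W1, b1, W2, b2):
    """Return d y_k / d x_i for one input x, shape (n_outputs, n_inputs)."""
    h, y = forward(x, W1, b1, W2, b2)
    dy = y * (1.0 - y)         # sigmoid derivative at the output layer
    dh = h * (1.0 - h)         # sigmoid derivative at the hidden layer
    # Chain rule: dy_k/dx_i = y_k' * sum_j W2[j, k] * h_j' * W1[i, j]
    return (dy[:, None] * W2.T) @ (dh[:, None] * W1.T)


def saliency(X, W1, b1, W2, b2):
    """Mean absolute output derivative per input feature.

    The averaging rule is an illustrative assumption, not the paper's metric.
    """
    jac = np.stack([output_input_jacobian(x, W1, b1, W2, b2) for x in X])
    return np.abs(jac).mean(axis=(0, 1))


# Hypothetical setup: 3 input features, 4 hidden nodes, 2 output nodes.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))
b1 = rng.normal(size=4)
W2 = rng.normal(size=(4, 2))
b2 = rng.normal(size=2)
W1[2, :] = 0.0                 # the network ignores feature 2 entirely
X = rng.normal(size=(50, 3))

s = saliency(X, W1, b1, W2, b2)
print(s)                       # the entry for feature 2 is exactly zero
```

In this sketch a feature whose weights into the hidden layer are all zero receives zero saliency, which matches the intuition that the network output cannot depend on it. With a trained network, the same function would be applied to the learned weights and the training or test inputs.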
