Classifier's Complexity Control while Training Multilayer Perceptrons

We consider an integrated approach to designing the classification rule, in which the strengths of statistical and neural-network approaches are merged. Instead of using multivariate models and statistical methods directly to design the classifier, we use them to whiten the data and then to train the perceptron. Particular attention is paid to the magnitudes of the weights and to optimization of the training procedure. We study how the result is influenced by the characteristics of the cost function (target values, conventional regularization parameters) and by the parameters of the optimization method (learning step, starting weights, and noise injected into the original training vectors, the targets, and the weights). Some of the complexity-control methods considered here have so far received little attention in the literature.
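The whiten-then-train scheme with weight decay and input-noise injection can be sketched as follows. This is a minimal illustration on synthetic two-class data; the ZCA whitening, the tanh unit, and all parameter values (learning step, decay strength, noise level, starting-weight scale) are assumptions for the sketch, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class Gaussian data with correlated features (hypothetical example).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)) @ A,
               rng.normal(2.0, 1.0, (100, 2)) @ A])
y = np.hstack([-np.ones(100), np.ones(100)])

# Statistical preprocessing: center the data, then whiten it (ZCA) so the
# sample covariance becomes the identity before perceptron training.
mu = X.mean(axis=0)
vals, vecs = np.linalg.eigh(np.cov(X - mu, rowvar=False))
Xw = (X - mu) @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)

# Single-layer perceptron (tanh unit) trained by gradient descent on a
# squared-error cost, with two complexity controls from the abstract:
# weight decay (regularization) and Gaussian noise injected into inputs.
Xb = np.hstack([Xw, np.ones((len(Xw), 1))])   # append bias column
w = rng.normal(0.0, 0.01, 3)                  # small starting weights
eta, lam, noise_sd = 0.1, 1e-3, 0.1           # learning step, decay, noise
for epoch in range(200):
    noise = np.hstack([rng.normal(0, noise_sd, Xw.shape),
                       np.zeros((len(Xw), 1))])  # do not perturb the bias
    Xn = Xb + noise
    out = np.tanh(Xn @ w)
    grad = Xn.T @ ((out - y) * (1 - out ** 2)) / len(y)
    w -= eta * (grad + lam * w)               # decay shrinks weight magnitudes

acc = np.mean(np.sign(Xb @ w) == y)           # training accuracy on clean data
```

Both the decay term and the injected noise keep the weight magnitudes small, which is one way the effective complexity of the trained classifier is controlled.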
