Paper information - Eigenvalue decay: A new method for neural network regularization

This paper proposes two new training algorithms for multilayer perceptrons based on evolutionary computation, regularization, and transduction. Regularization is a commonly used technique for preventing the learning algorithm from overfitting the training data. In this context, the work introduces and analyzes a novel regularization scheme for neural networks (NNs), named eigenvalue decay, which aims to improve the classification margin. Eigenvalue decay led to the development of a new training method based on the same principles as the SVM, hence named Support Vector NN (SVNN). Finally, by analogy with the transductive SVM (TSVM), a transductive NN (TNN) is proposed, which exploits SVNN to address transductive learning. The effectiveness of the proposed algorithms is evaluated on seven benchmark datasets.
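The abstract does not spell out the form of the penalty, so the following is only a rough sketch under a stated assumption: that eigenvalue decay adds to the training loss a term proportional to the largest eigenvalue of W W^T for a weight matrix W. The function names (`gram`, `largest_eigenvalue`, `penalized_loss`), the weight `kappa`, and the use of power iteration are all hypothetical choices made for illustration, not the authors' implementation.

```python
def gram(W):
    """Return W @ W^T for a weight matrix W given as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in W] for r1 in W]


def largest_eigenvalue(A, iters=200):
    """Estimate the largest eigenvalue of a symmetric PSD matrix A.

    Uses plain power iteration starting from the all-ones vector; this
    assumes the start vector is not orthogonal to the dominant eigenvector.
    """
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0  # A maps v to zero, so there is nothing to penalize
        v = [x / norm for x in w]
        # Rayleigh quotient v^T A v for the current unit vector v
        lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam


def penalized_loss(data_loss, W, kappa=1e-3):
    """Data loss plus the assumed penalty kappa * lambda_max(W W^T)."""
    return data_loss + kappa * largest_eigenvalue(gram(W))


# Toy example: a 2x2 diagonal weight matrix and a made-up data loss of 0.5.
W = [[3.0, 0.0], [0.0, 1.0]]
print(largest_eigenvalue(gram(W)))
print(penalized_loss(0.5, W, kappa=0.1))
```

In an actual training loop the penalty would have to be differentiated with respect to the weights (or handled by a gradient-free optimizer such as the evolutionary approach the abstract mentions); the sketch only evaluates the penalized objective.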
