Optimization of Neural Networks with Multi-Objective LASSO Algorithm

This paper presents a bi-objective algorithm that simultaneously optimizes the error and the sum of the absolute weights of a Multi-Layer Perceptron neural network. The algorithm is based on the linear Least Absolute Shrinkage and Selection Operator (LASSO) and performs generalization control and weight selection within a single optimization procedure. It searches for a set of efficient solutions, the Pareto set, from which a single weight vector with the best performance and a reduced number of weights is selected according to a validation criterion. The method is applied to real-world classification and regression problems and compared with the norm-based multi-objective algorithm. Results show that the resulting neural networks have improved generalization performance and reduced topology.
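To make the trade-off concrete, the following is a minimal sketch, not the paper's algorithm: it approximates the Pareto set by sweeping a scalarization weight lambda over the combined objective MSE + lambda * sum(|w|) for a small one-hidden-layer MLP, then applies a validation-based selection step in the spirit of the abstract. The data, network sizes, and all function names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data with a train/validation split (illustrative only).
X = rng.normal(size=(200, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)
Xtr, ytr, Xva, yva = X[:150], y[:150], X[150:], y[150:]

def init(n_in, n_hid, n_out=1):
    # One-hidden-layer MLP weights (biases omitted for brevity).
    return [rng.normal(scale=0.5, size=(n_in, n_hid)),
            rng.normal(scale=0.5, size=(n_hid, n_out))]

def forward(W, X):
    H = np.tanh(X @ W[0])          # hidden-layer activations
    return H, H @ W[1]             # network output

def mse(W, X, y):
    return float(np.mean((forward(W, X)[1][:, 0] - y) ** 2))

def train(lam, epochs=2000, lr=0.05):
    """Minimize MSE + lam * sum(|w|) by (sub)gradient descent."""
    W = init(Xtr.shape[1], 8)
    for _ in range(epochs):
        H, out = forward(W, Xtr)
        err = out[:, 0] - ytr
        g_out = 2 * err[:, None] / len(ytr)       # dMSE/d(output)
        gW1 = H.T @ g_out
        g_hid = (g_out @ W[1].T) * (1 - H ** 2)   # backprop through tanh
        gW0 = Xtr.T @ g_hid
        # Subgradient of the L1 penalty drives small weights toward zero,
        # which is what yields the reduced topology.
        W[0] -= lr * (gW0 + lam * np.sign(W[0]))
        W[1] -= lr * (gW1 + lam * np.sign(W[1]))
    return W

# Sweep the trade-off weight to trace an approximate Pareto front
# (training error vs. sum of absolute weights), then pick the single
# solution with the lowest validation error, mirroring the paper's
# validation-based selection criterion.
front = []
for lam in [0.0, 1e-4, 1e-3, 1e-2, 1e-1]:
    W = train(lam)
    front.append((lam, mse(W, Xtr, ytr),
                  sum(np.abs(w).sum() for w in W), mse(W, Xva, yva)))

best = min(front, key=lambda t: t[3])
print("lambda   train_mse  sum|w|   val_mse")
for row in front:
    print("%.4f   %.4f     %.2f    %.4f" % row)
print("selected lambda:", best[0])
```

Sweeping lambda is only a scalarized approximation of the true bi-objective search: each lambda value recovers one point on (a convex portion of) the error-versus-norm front, whereas the paper's LASSO-based procedure generates the Pareto set directly.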
