A New Constructive Method to Optimize Neural Network Architecture and Generalization

In this paper, after analyzing the causes of poor generalization and overfitting in neural networks, we treat some noisy data points as singular values of a continuous function, namely jump discontinuity points. The continuous part can be approximated by the simplest neural networks, which have good generalization performance and an optimal network architecture, using traditional algorithms such as the constructive algorithm for feed-forward neural networks with incremental training, the BP algorithm, the ELM algorithm, various other constructive algorithms, RBF approximation, and SVM. At the same time, we construct RBF neural networks that fit the singular values to within any given error. We prove that a function with jump discontinuity points can be approximated, to within any given error, by the simplest neural network combined with a decay RBF neural network, and that such a function can be constructively approximated by a decay RBF neural network to within any given error. The constructive part has no influence on the generalization of the whole machine learning system. The method therefore optimizes the network architecture and generalization performance, and reduces overfitting by avoiding fitting the noisy data.
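The decomposition described above can be illustrated with a minimal sketch. It is not the construction proved in the paper: a low-degree polynomial stands in for the "simplest neural network", the singular points are flagged with a fixed residual threshold, and each one is then fitted by a narrow Gaussian bump. The function name `fit_with_rbf_correction` and the parameters `degree`, `threshold`, and `width` are assumptions made for illustration only. The sketch also assumes the singular points are separated by many bump widths, so each bump height can be set directly to the local residual without solving a linear system.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_with_rbf_correction(x, y, degree=5, threshold=0.5, width=0.005):
    """Split a fit into a smooth part plus narrow Gaussian bumps (illustrative only)."""
    # Provisional smooth fit, used only to flag points far off the trend.
    provisional = Polynomial.fit(x, y, degree)
    singular = np.abs(y - provisional(x)) > threshold

    # Refit the smooth part on the remaining points so the flagged
    # points no longer pull the smooth model toward themselves.
    smooth = Polynomial.fit(x[~singular], y[~singular], degree)

    # One Gaussian bump per flagged point; its height is the residual there.
    # Assumption: centers are many widths apart, so bumps do not overlap.
    centers = x[singular]
    heights = y[singular] - smooth(centers)

    def correction(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        bumps = np.exp(-((t[:, None] - centers[None, :]) / width) ** 2)
        return bumps @ heights

    def model(t):
        return smooth(t) + correction(t)

    return smooth, correction, model, singular


# Toy data: a smooth target with two isolated corrupted samples.
x = np.linspace(0.0, 1.0, 41)
y = np.sin(2 * np.pi * x)
y[10] += 2.0
y[30] -= 1.5

smooth, correction, model, singular = fit_with_rbf_correction(x, y)
print("flagged indices:", np.flatnonzero(singular))
print("max smooth-part error:", np.max(np.abs(smooth(x) - np.sin(2 * np.pi * x))))
print("max correction at unflagged points:", np.max(np.abs(correction(x[~singular]))))
```

In this toy setting the combined model reproduces the corrupted samples, while the bumps decay so quickly that predictions at every other sample come from the smooth part alone, which is the sense in which the correction leaves generalization unaffected.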
