Two-Phase Construction of Multilayer Perceptrons Using Information Theory

This brief presents a two-phase construction approach that prunes both the input and hidden units of multilayer perceptrons (MLPs) using mutual information (MI). In the first phase, all features of the input vectors are ranked by their relevance to the target outputs through a forward strategy. The salient input units of the MLP are then determined by following this ranking and considering each feature's contribution to the network's performance, so that irrelevant features can be identified and eliminated. In the second phase, redundant hidden units are removed from the trained MLP one at a time according to a novel relevance measure. The proposed strategy exhibits better performance than related work. Moreover, experimental results show that it is comparable or even superior to support vector machines (SVMs) and support vector regression (SVR). Finally, the advantages of the MI-based method are investigated in comparison with a sensitivity analysis (SA)-based method.
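The forward ranking of input features by MI can be illustrated with a minimal sketch. The abstract does not specify the MI estimator or the exact selection criterion, so everything below is an assumption made for illustration: a plug-in (frequency-count) estimator over discrete features, a greedy rule that adds the feature maximising the MI between the selected set and the target, and the hypothetical names `mutual_information` and `forward_rank`.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over observed pairs only.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def forward_rank(columns, target):
    """Greedy forward ranking (illustrative assumption, not the paper's exact rule).

    At each step, add the feature whose inclusion maximises the MI between
    the jointly selected features and the target.
    """
    remaining = list(range(len(columns)))
    selected = []
    joint = [()] * len(target)  # joint value of the features selected so far

    def extend(j):
        # Joint symbol per sample after appending feature j.
        return [t + (v,) for t, v in zip(joint, columns[j])]

    while remaining:
        best = max(remaining, key=lambda j: mutual_information(extend(j), target))
        joint = extend(best)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the target is fully determined by columns 1 and 2; column 0 is noise.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
noise = [0, 1, 0, 1, 0, 1, 0, 1]
columns = [noise, a, b]
target = [2 * ai + bi for ai, bi in zip(a, b)]

ranking = forward_rank(columns, target)
print(ranking)
```

In this toy case the two informative columns are ranked ahead of the noise column, which comes last. A plug-in estimator of this kind only suits discrete features with enough samples per joint value; continuous inputs would need a different estimator.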
