Feature selection of generalized extreme learning machine for regression problems

A generalized single-hidden-layer feedforward network, an extension of the original extreme learning machine (ELM), was recently proposed. Unlike the traditional ELM, this generalized ELM (GELM) uses p-order reduced polynomial functions of the complete set of input features as its output weights. Empirical results suggest that some of the input features used to construct these polynomial output weights may be insignificant or redundant, yet to date no work has addressed selecting appropriate input features for constructing the output weights of GELM. Hence, in this paper two greedy learning algorithms, a forward feature selection algorithm (FFS-GELM) and a backward feature selection algorithm (BFS-GELM), are proposed to tackle this issue. To reduce computational complexity, FFS-GELM employs an iterative strategy, and its convergence is proved. BFS-GELM prunes the model through a decreasing iteration, and an accelerating scheme is proposed to speed up the removal of insignificant or redundant features. Experiments on twelve benchmark data sets demonstrate that both FFS-GELM and BFS-GELM select appropriate input features for constructing the p-order reduced polynomial output weights of GELM, enhancing generalization performance while reducing testing time compared with the original GELM. BFS-GELM outperforms FFS-GELM in terms of sparsity ratio, testing time, and training time, but at the cost of slightly worse generalization performance than FFS-GELM.
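To make the greedy selection idea concrete, the following is a minimal sketch in Python of forward feature selection for a regression model, in the spirit of FFS-GELM. It is not the paper's algorithm: closed-form ridge regression (`ridge_fit_predict`) stands in for GELM training, and the paper's iterative update and convergence guarantee are not reproduced; the stopping tolerance `tol` is likewise an assumption for illustration.

```python
import numpy as np

def ridge_fit_predict(X_tr, y_tr, X_val, lam=1e-3):
    """Stand-in trainer: closed-form ridge regression (not the paper's GELM)."""
    d = X_tr.shape[1]
    w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)
    return X_val @ w

def forward_feature_selection(X_tr, y_tr, X_val, y_val, max_feats=None, tol=1e-6):
    """Greedily add, one at a time, the feature that most reduces validation MSE."""
    n_feats = X_tr.shape[1]
    max_feats = max_feats or n_feats
    selected, best_mse = [], np.inf
    while len(selected) < max_feats:
        candidate, candidate_mse = None, best_mse
        for j in range(n_feats):
            if j in selected:
                continue
            cols = selected + [j]
            pred = ridge_fit_predict(X_tr[:, cols], y_tr, X_val[:, cols])
            mse = np.mean((pred - y_val) ** 2)
            if mse < candidate_mse:
                candidate, candidate_mse = j, mse
        # Stop when no remaining feature improves the error by more than tol.
        if candidate is None or best_mse - candidate_mse < tol:
            break
        selected.append(candidate)
        best_mse = candidate_mse
    return selected, best_mse

# Usage on synthetic data: only features 2 and 7 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X[:, 2] - 0.5 * X[:, 7] + 0.1 * rng.normal(size=200)
feats, mse = forward_feature_selection(X[:100], y[:100], X[100:], y[100:])
print(feats, mse)
```

A backward variant (in the spirit of BFS-GELM) would instead start from all features and repeatedly drop the one whose removal least degrades validation error; the paper's accelerating scheme for that removal step is not sketched here.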
