Meta-ELM: ELM with ELM hidden nodes

Extreme Learning Machine (ELM) assigns its input weights and biases randomly, which inevitably introduces stochastic behavior and can degrade generalization performance. In this paper, we propose a meta-learning model of ELM, called Meta-ELM. The Meta-ELM architecture consists of several base ELMs and one top ELM, so learning proceeds in two stages. First, each base ELM is trained on a subset of the training data. Then, the top ELM is learned with the base ELMs serving as its hidden nodes. Theoretical analysis and experimental results on several artificial and benchmark regression datasets show that the proposed Meta-ELM model is feasible and effective.
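Below is a minimal sketch of the two-stage training scheme described above, written in NumPy. The class names, the sigmoid activation, the random subset split, and the use of the Moore-Penrose pseudoinverse for the output weights are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class BaseELM:
    """A standard single-hidden-layer ELM: random hidden layer, analytic output weights."""
    def __init__(self, n_hidden, rng):
        self.n_hidden = n_hidden
        self.rng = rng

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))  # sigmoid hidden layer

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))  # random input weights
        self.b = self.rng.normal(size=self.n_hidden)                # random biases
        self.beta = np.linalg.pinv(self._hidden(X)) @ y             # least-squares output weights
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

class MetaELM:
    """Top ELM whose hidden nodes are trained base ELMs (assumed structure)."""
    def __init__(self, n_base=5, n_hidden=20, seed=0):
        self.n_base, self.n_hidden = n_base, n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n = X.shape[0]
        # Stage 1: train each base ELM on a random subset of the training data.
        self.bases = []
        for _ in range(self.n_base):
            idx = self.rng.choice(n, size=n // self.n_base, replace=False)
            self.bases.append(BaseELM(self.n_hidden, self.rng).fit(X[idx], y[idx]))
        # Stage 2: treat base-ELM outputs as hidden-node activations of the top ELM
        # and solve for its output weights on the full training set.
        H_top = np.column_stack([m.predict(X) for m in self.bases])
        self.alpha = np.linalg.pinv(H_top) @ y
        return self

    def predict(self, X):
        H_top = np.column_stack([m.predict(X) for m in self.bases])
        return H_top @ self.alpha

# Toy regression example on the sinc function.
X = np.linspace(-5, 5, 400).reshape(-1, 1)
y = np.sinc(X).ravel()
model = MetaELM(n_base=5, n_hidden=20, seed=42).fit(X, y)
print("train RMSE:", np.sqrt(np.mean((model.predict(X) - y) ** 2)))
```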
