Regularized ensemble neural networks models in the Extreme Learning Machine framework

Abstract: Extreme Learning Machine (ELM) has proven to be an efficient and fast algorithm for classification. To generalize the results of standard ELM, several ensemble meta-algorithms have been implemented. In this manuscript, we propose a hierarchical ensemble methodology that promotes diversity among the members of an ensemble explicitly, through the loss function, in the single-hidden-layer feedforward network version of ELM. The diversity term in the loss function is justified using the concept of regularization from the Negative Correlation Learning framework. Statistical tests show that our proposal is competitive with bagging and boosting ensemble methodologies in both performance and diversity measures.
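For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of a standard ridge-regularized single-hidden-layer ELM and a plain averaging ensemble of independently initialized members; it is not the proposed method and does not include the diversity term. The function names, the tanh activation, the hyperparameters (n_hidden, C), and the toy data are all assumptions made for the example.

```python
import numpy as np

def train_elm(X, Y, n_hidden=20, C=1.0, seed=0):
    """Fit a basic ELM: random hidden layer, ridge-regularized output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.standard_normal(n_hidden)                # random hidden biases (never trained)
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    # Output weights in closed form: beta = (H^T H + I/C)^{-1} H^T Y
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ Y)
    return W, b, beta

def predict_elm(model, X):
    """Return the raw output scores of a trained ELM."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy two-class problem (illustrative data only): two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (50, 2)), rng.normal(2.0, 0.5, (50, 2))])
labels = np.repeat([0, 1], 50)
Y = np.eye(2)[labels]                                # one-hot targets

# Single ELM.
model = train_elm(X, Y)
accuracy = (predict_elm(model, X).argmax(axis=1) == labels).mean()

# Naive ensemble: members differ only in their random seed; scores are averaged.
members = [train_elm(X, Y, seed=s) for s in range(5)]
avg_scores = np.mean([predict_elm(m, X) for m in members], axis=0)
ensemble_accuracy = (avg_scores.argmax(axis=1) == labels).mean()
```

In this sketch the members are trained independently, so any diversity comes only from the random hidden weights. The approach described in the abstract instead adds an explicit diversity term to each member's loss.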
