A Maximally Split and Relaxed ADMM for Regularized Extreme Learning Machines

One of the salient features of the extreme learning machine (ELM) is its fast learning speed. However, in a big data environment, the ELM still suffers from a heavy computational load due to high dimensionality and large data volume. Using the alternating direction method of multipliers (ADMM), a convex model-fitting problem can be split into a set of concurrently executable subproblems, each involving only a subset of the model coefficients. By maximally splitting across the coefficients and incorporating a novel relaxation technique, a maximally split and relaxed ADMM (MS-RADMM), along with a scalarwise implementation, is developed for the regularized ELM (RELM). The convergence conditions and convergence rate of the MS-RADMM are established: it converges linearly, with a smaller convergence ratio than the unrelaxed maximally split ADMM. The optimal parameter values of the MS-RADMM are obtained, and a fast parameter selection scheme is provided. Experiments on ten benchmark classification data sets demonstrate the fast convergence and parallelism of the MS-RADMM. For performance evaluation, the MS-RADMM is compared with the matrix-inversion-based method in terms of the numbers of multiplication and addition operations, the computation time, and the number of memory cells.
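For orientation, the matrix-inversion-based baseline mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration and not the MS-RADMM itself. It assumes the commonly used RELM objective (1/2)||β||² + (C/2)||Hβ − T||², a sigmoid hidden layer with random weights, and synthetic data. The function name `relm_fit`, the parameters `n_hidden` and `C`, and the toy data set are hypothetical choices made for this example.

```python
import numpy as np

def relm_fit(X, T, n_hidden=20, C=10.0, seed=0):
    """Closed-form RELM training (illustrative baseline, assumed formulation).

    Minimizes 0.5*||beta||^2 + 0.5*C*||H @ beta - T||^2, whose solution
    satisfies (H^T H + I/C) beta = H^T T.
    """
    rng = np.random.default_rng(seed)
    # Random, untrained hidden-layer weights and biases (the ELM idea).
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    # Hidden-layer output matrix with a sigmoid activation (an assumption).
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Solve the regularized normal equations; this n_hidden x n_hidden solve
    # is the step that a coefficient-wise splitting scheme aims to avoid.
    A = H.T @ H + np.eye(n_hidden) / C
    beta = np.linalg.solve(A, H.T @ T)
    return W, b, beta, H

# Toy two-class problem with one-hot targets (synthetic data).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(2)[labels]

W, b, beta, H = relm_fit(X, T, n_hidden=20, C=10.0)
pred = np.argmax(H @ beta, axis=1)
train_accuracy = float(np.mean(pred == labels))
```

The linear solve costs on the order of the cube of the number of hidden nodes, which is the kind of cost the complexity comparison in the abstract refers to.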
