Binary/ternary extreme learning machines

In this paper, a new hidden-layer construction method for Extreme Learning Machines (ELMs) is investigated, aimed at generating a diverse set of weights. The paper proposes two new ELM variants: Binary ELM, with a weight initialization scheme based on {0, 1} weights, and Ternary ELM, with a weight initialization scheme based on {-1, 0, 1} weights. The motivation is that these features come from very different subspaces, so each neuron extracts more diverse information from the inputs than the fully random features traditionally used in ELM; ideally, this should lead to better ELMs. Experiments show that ELMs with ternary weights indeed generally achieve lower test error. Furthermore, the experiments show that Binary and Ternary ELMs are more robust to irrelevant and noisy variables and in fact perform implicit variable selection. Finally, since only the weight generation scheme is adapted, the computational time of the ELM is unaffected; the improved accuracy, added robustness, and implicit variable selection of Binary ELM and Ternary ELM therefore come for free.
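
To make the idea concrete, below is a minimal Python sketch of an ELM whose hidden-layer weights are drawn from {0, 1} (binary) or {-1, 0, 1} (ternary). This is an illustration under assumptions, not the paper's exact construction: each weight entry is sampled uniformly at random from the given set, the activation is tanh, and the function names (`random_discrete_weights`, `elm_fit`, `elm_predict`) are invented for this sketch.

```python
import numpy as np

def random_discrete_weights(n_inputs, n_hidden, scheme="ternary", rng=None):
    """Draw hidden-layer weights from {0, 1} (binary) or {-1, 0, 1} (ternary).

    Assumption: entries are sampled uniformly from the set; the paper's
    scheme for covering diverse input subspaces may differ in detail.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = [0, 1] if scheme == "binary" else [-1, 0, 1]
    return rng.choice(values, size=(n_inputs, n_hidden)).astype(float)

def elm_fit(X, y, n_hidden=100, scheme="ternary", rng=None):
    """Train a basic ELM: random discrete hidden layer, least-squares output."""
    rng = np.random.default_rng() if rng is None else rng
    W = random_discrete_weights(X.shape[1], n_hidden, scheme, rng)
    b = rng.uniform(-1, 1, size=n_hidden)          # random hidden biases
    H = np.tanh(X @ W + b)                         # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # output weights (least squares)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta

# Example usage on synthetic data (shapes only, for illustration):
# X = np.random.default_rng(0).normal(size=(200, 10))
# y = X[:, 0] - X[:, 1]
# W, b, beta = elm_fit(X, y, n_hidden=50, scheme="ternary")
# y_hat = elm_predict(X, W, b, beta)
```

Note that only the weight generation differs from a standard ELM; training still reduces to a single least-squares solve, which is why the discrete schemes add no computational cost.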
