Regularized training of the extreme learning machine using the conjugate gradient method

We describe a new algorithm for regularized training of the extreme learning machine (ELM) that uses a modified conjugate gradient (CG) method to determine the network's hidden-to-output weights. The CG method is modified to include a validation-set performance calculation at each iteration. The solution is initialized to zero, and during the CG iterations we monitor the validation-set error; when that error begins to rise, we terminate the CG algorithm. The cost per iteration is O(P^2), where P is the number of output weights, which is significantly faster than the O(P^3) operations required by ridge regression regularization methods. We demonstrate the effectiveness of our method by classifying the MNIST database, achieving an accuracy of 99.2% with an ELM classifier processing the unmodified pixel values.
