Structural adaptation for sparsely connected MLP using Newton's method

In this work, we propose a paradigm for constructing a sparsely connected multi-layer perceptron (MLP). Using the Orthogonal Least Squares (OLS) method during training, the proposed approach prunes hidden units and output weights according to their usefulness, yielding a sparsely connected MLP. We formulate a second-order algorithm that gives a closed-form expression for the hidden-unit learning factors, thereby minimizing the number of hand-tuned parameters. The usefulness of the proposed algorithm is further substantiated by its ability to separate two combined datasets. On widely available datasets, the proposed algorithm's 10-fold testing error is shown to be lower than that of several other algorithms. The present work thus addresses inducing sparsity in a fully connected neural network, pruning of hidden units, Newton's method for optimization, and orthogonal least squares; an illustrative sketch of the OLS-based pruning idea is given below.
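To make the OLS-based usefulness ranking concrete, the following is a minimal sketch, not the authors' implementation. It assumes a matrix H of hidden-unit activations and a target matrix T, and ranks hidden units by their incremental error-reduction ratio via a Gram-Schmidt style forward sweep; the units at the bottom of the ranking would be the pruning candidates. All names (ols_rank_hidden_units, H, T) are illustrative assumptions.

    import numpy as np

    def ols_rank_hidden_units(H, T):
        """Rank hidden units by incremental error reduction (OLS sweep).

        H : (N, Nh) hidden-unit activations for N training patterns
        T : (N, M)  desired outputs
        Returns a list of hidden-unit indices, most useful first.
        """
        N, Nh = H.shape
        remaining = list(range(Nh))
        selected = []
        Q = []  # orthogonalized activation vectors chosen so far
        for _ in range(Nh):
            best_err, best_j, best_q = -np.inf, None, None
            for j in remaining:
                q = H[:, j].astype(float).copy()
                # orthogonalize the candidate against already-selected directions
                for qi in Q:
                    q -= (qi @ q) / (qi @ qi) * qi
                denom = q @ q
                if denom < 1e-12:
                    err = 0.0          # numerically redundant unit
                else:
                    # error-reduction ratio contributed by this candidate
                    err = np.sum((q @ T) ** 2) / denom
                if err > best_err:
                    best_err, best_j, best_q = err, j, q
            selected.append(best_j)
            remaining.remove(best_j)
            Q.append(best_q)
        return selected

    # Example use: keep the first Nh_keep indices returned above and
    # prune the remaining hidden units (and their output weights).

This ordering-by-error-reduction view is what makes the pruning decision data-driven rather than threshold-tuned; how the retained units' learning factors are then obtained via Newton's method is described in the paper itself and is not reproduced here.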
