Denoising AutoEncoder in Neural Networks with Modified Elliott Activation Function and Sparsity-Favoring Cost Function

Neural networks (NNs) are machine learning architectures and algorithms that are powerful for tasks such as classification, clustering, and pattern recognition. A sufficiently large neural network is a universal function approximator, so it tends to learn from the training data not only the useful information but also the noise. As the number of neurons and hidden layers grows, the number of connections in the network grows rapidly, and the overfitting problem becomes more severe, with the model increasingly biased towards noise. Various methods have been proposed to address this problem, such as the autoencoder, Dropout, DropConnect, and Factored Mean training. In this paper, we propose a denoising autoencoder approach that uses a modified Elliott activation function and a cost function that favors sparsity in the input data. Preliminary experiments with the modified algorithm on several real data sets show that the proposed approach performs well.
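
The sketch below illustrates the general idea of a denoising autoencoder built around an Elliott-style activation and a sparsity term in the cost, assuming the standard Elliott function x / (1 + |x|) and an L1 penalty on the hidden code; the paper's specific modification to the Elliott function and its exact sparsity-favoring cost are not given in this abstract, and names such as `DenoisingAutoencoder`, `dae_loss`, and `sparsity_weight` are illustrative, not the authors' implementation.

```python
# Minimal sketch of a denoising autoencoder with an Elliott activation and an
# L1 sparsity penalty. The exact modified Elliott function and cost from the
# paper are not specified here; this uses common, assumed choices.
import torch
import torch.nn as nn


def elliott(x):
    # Standard Elliott activation: smooth, bounded in (-1, 1), cheaper than tanh.
    return x / (1.0 + torch.abs(x))


class DenoisingAutoencoder(nn.Module):
    def __init__(self, n_visible=784, n_hidden=256):
        super().__init__()
        self.encoder = nn.Linear(n_visible, n_hidden)
        self.decoder = nn.Linear(n_hidden, n_visible)

    def forward(self, x):
        h = elliott(self.encoder(x))   # hidden code
        return self.decoder(h), h      # reconstruction and code


def dae_loss(x_clean, x_recon, h, sparsity_weight=1e-3):
    # Denoising criterion: reconstruct the *clean* input from the corrupted one,
    # plus an L1 term that encourages sparse hidden activations (assumed form).
    recon = nn.functional.mse_loss(x_recon, x_clean)
    return recon + sparsity_weight * h.abs().mean()


# Usage sketch: corrupt inputs with masking noise, train to recover the original.
model = DenoisingAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                           # stand-in for a batch of inputs
x_noisy = x * (torch.rand_like(x) > 0.3).float()  # zero out roughly 30% of entries
x_recon, h = model(x_noisy)
loss = dae_loss(x, x_recon, h)
loss.backward()
opt.step()
```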
