Training neural networks by marginalizing out hidden layer noise

The generalization ability of neural networks is influenced by the size of the training set. Training a single-hidden-layer feedforward neural network (SLFN) consists of two stages: a nonlinear feature mapping into the hidden layer space, followed by optimization of a predictor in that space. In this paper, we propose a new approach, called marginalizing out hidden layer noise (MHLN), in which the predictor of an SLFN is effectively trained on infinitely many samples. First, MHLN augments the training set in the hidden layer space with corrupted samples, generated by perturbing the hidden layer outputs of the training samples with noise drawn from a given distribution. For any training sample, as the number of corruptions approaches infinity, the weak law of large numbers allows the explicitly generated corrupted samples to be replaced by their expectation. In this way, the training set is implicitly extended in the hidden layer space by an infinite number of corrupted samples. MHLN then constructs the predictor of the SLFN by minimizing the expected value of a quadratic loss function under the given noise distribution. Experiments on twenty benchmark datasets show that MHLN improves generalization ability.
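
To make the marginalization step concrete, the sketch below trains the output layer of an SLFN against the expected quadratic loss under additive Gaussian noise on the hidden layer outputs, so the closed-form solution replaces explicit corrupted copies of the data. It assumes an ELM-style random feature mapping, a Gaussian noise model, and a ridge term; the function names and parameters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def train_mhln_predictor(X, T, n_hidden=200, noise_var=0.1, reg=1e-3, seed=None):
    """Illustrative sketch: SLFN output weights trained by marginalizing
    additive Gaussian noise on the hidden layer outputs (assumed setup).

    X : (n, d) inputs, T : (n, c) regression or one-hot targets.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Stage 1: nonlinear feature mapping into the hidden layer space
    # (a random, ELM-style mapping is assumed here for simplicity).
    W_in = rng.normal(size=(d, n_hidden))
    b_in = rng.normal(size=n_hidden)
    H = np.tanh(X @ W_in + b_in)                 # (n, n_hidden)

    # Stage 2: minimize the EXPECTED quadratic loss
    #   E[ || (H + E) W - T ||_F^2 ],   E_ij ~ N(0, noise_var),
    # instead of averaging over explicitly corrupted copies of H.
    # Because E[(H + E)^T (H + E)] = H^T H + n * noise_var * I and
    # E[(H + E)^T T] = H^T T, the expectation has the closed form below.
    A = H.T @ H + (n * noise_var + reg) * np.eye(n_hidden)
    W_out = np.linalg.solve(A, H.T @ T)
    return W_in, b_in, W_out

def predict(X, W_in, b_in, W_out):
    # Apply the same hidden layer mapping, then the marginalized predictor.
    return np.tanh(X @ W_in + b_in) @ W_out
```

Under these assumptions the marginalized noise acts as an extra ridge penalty of strength n * noise_var on the output weights, which is why no corrupted samples ever need to be generated explicitly.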
