[1] Frank Nielsen, et al. Statistical exponential families: A digest with flash cards, 2009, ArXiv.
[2] Kilian Q. Weinberger, et al. Marginalizing stacked linear denoising autoencoders, 2015, J. Mach. Learn. Res.
[3] Saad, et al. Exact solution for on-line learning in multilayer neural networks, 1995, Physical Review Letters.
[4] Jeffrey Pennington, et al. Nonlinear random matrix theory for deep learning, 2019, NIPS.
[5] Stephen Tyree, et al. Learning with Marginalized Corrupted Features, 2013, ICML.
[6] Max Tegmark, et al. Why Does Deep and Cheap Learning Work So Well?, 2016, Journal of Statistical Physics.
[7] Surya Ganguli, et al. Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice, 2017, NIPS.
[8] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[9] Andrew M. Saxe, et al. High-dimensional dynamics of generalization error in neural networks, 2017, Neural Networks.
[10] James L. McClelland, et al. Learning hierarchical category structure in deep neural networks, 2013.
[11] Christopher D. Manning, et al. Fast dropout training, 2013, ICML.
[12] D. Zipser, et al. Learning the hidden structure of speech, 1988, The Journal of the Acoustical Society of America.
[13] Naftali Tishby, et al. Opening the Black Box of Deep Neural Networks via Information, 2017, ArXiv.
[14] Sida I. Wang, et al. Dropout Training as Adaptive Regularization, 2013, NIPS.
[15] Pascal Vincent, et al. Generalized Denoising Auto-Encoders as Generative Models, 2013, NIPS.
[16] Pascal Vincent, et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction, 2011, ICML.
[17] Geoffrey E. Hinton, et al. On rectified linear units for speech processing, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Gustav Larsson, et al. Discovery of Visual Semantics by Unsupervised and Self-Supervised Representation Learning, 2017, ArXiv.
[19] Yoshua Bengio, et al. Greedy Layer-Wise Training of Deep Networks, 2006, NIPS.
[20] Hugo Larochelle, et al. An Autoencoder Approach to Learning Bilingual Word Representations, 2014, NIPS.
[21] Yoshua Bengio, et al. Marginalized Denoising Auto-encoders for Nonlinear Representations, 2014, ICML.
[22] Jeffrey Pennington, et al. Geometry of Neural Network Loss Surfaces via Random Matrix Theory, 2017, ICML.
[23] Elad Hoffer, et al. Exponentially vanishing sub-optimal local minima in multilayer neural networks, 2017, ICLR.
[24] Opper. Learning times of neural networks: Exact solution for a perceptron algorithm, 1988, Physical Review A, General Physics.
[25] Yang Liu, et al. Neural Machine Translation with Reconstruction, 2016, AAAI.
[26] Yoshua Bengio, et al. What regularized auto-encoders learn from the data-generating distribution, 2012, J. Mach. Learn. Res.
[27] Stefano Soatto, et al. Emergence of invariance and disentangling in deep representations, 2017.
[28] Pascal Vincent, et al. A Connection Between Score Matching and Denoising Autoencoders, 2011, Neural Computation.
[29] Matthias Hein, et al. The Loss Surface of Deep and Wide Neural Networks, 2017, ICML.
[30] Terence D. Sanger, et al. Optimal unsupervised learning in a single-layer linear feedforward neural network, 1989, Neural Networks.
[31] Ruslan Salakhutdinov, et al. Geometry of Optimization and Implicit Regularization in Deep Learning, 2017, ArXiv.
[32] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[33] Kurt Hornik, et al. Neural networks and principal component analysis: Learning from examples without local minima, 1989, Neural Networks.
[34] Surya Ganguli, et al. Analyzing noise in autoencoders and deep networks, 2014, ArXiv.
[35] Christopher M. Bishop, et al. Current address: Microsoft Research, 2022.
[36] Zhenyu Liao, et al. A Random Matrix Approach to Neural Networks, 2017, ArXiv.
[37] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[38] Yoshua Bengio, et al. Extracting and composing robust features with denoising autoencoders, 2008, ICML.
[39] Razvan Pascanu, et al. Local minima in training of neural networks, 2016, ArXiv:1611.06310.