Why Does Unsupervised Pre-training Help Deep Learning?
Yoshua Bengio | Dumitru Erhan | Aaron C. Courville | Pascal Vincent
[1] A. Yao. Separating the polynomial-time hierarchy by oracles , 1985 .
[2] Johan Håstad,et al. Almost optimal lower bounds for small depth circuits , 1986, STOC '86.
[3] Lalit R. Bahl,et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[4] M. Bornstein. Sensitive periods in development : interdisciplinary perspectives , 1987 .
[5] Yann LeCun. PhD thesis: Modeles connexionnistes de l'apprentissage (connectionist learning models) , 1987 .
[6] Andrew R. Barron,et al. Complexity Regularization with Application to Artificial Neural Networks , 1991 .
[7] L. Ljung,et al. Overtraining, regularization and searching for a minimum, with application to neural networks , 1995 .
[8] H. Sebastian Seung,et al. Learning Continuous Attractors in Recurrent Networks , 1997, NIPS.
[9] Klaus-Robert Müller,et al. Asymptotic statistical theory of overtraining and cross-validation , 1997, IEEE Trans. Neural Networks.
[10] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[11] David Haussler,et al. Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.
[12] Yoshua Bengio,et al. Object Recognition with Gradient-Based Learning , 1999, Shape, Contour and Grouping in Computer Vision.
[13] J. Tenenbaum,et al. A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.
[14] Michael I. Jordan,et al. On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes , 2001, NIPS.
[15] Mikhail Belkin,et al. Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.
[16] Simon Haykin,et al. Gradient-Based Learning Applied to Document Recognition , 2001 .
[17] Daniel Povey,et al. Minimum Phone Error and I-smoothing for improved discriminative training , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[18] Bernhard Schölkopf,et al. Cluster Kernels for Semi-Supervised Learning , 2002, NIPS.
[19] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[20] Geoffrey E. Hinton,et al. Exponential Family Harmoniums with an Application to Information Retrieval , 2004, NIPS.
[21] Johan Håstad,et al. On the power of small-depth threshold circuits , 1991, computational complexity.
[22] L. Bottou,et al. Training Invariant Support Vector Machines using Selective Sampling , 2005 .
[23] Nicolas Le Roux,et al. The Curse of Highly Variable Functions for Local Kernel Machines , 2005, NIPS.
[24] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[25] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[26] Alexander Zien,et al. Semi-Supervised Learning , 2006 .
[27] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[28] Marc'Aurelio Ranzato,et al. Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.
[29] Bernhard Schölkopf,et al. Introduction to Semi-Supervised Learning , 2006, Semi-Supervised Learning.
[30] Tom Minka,et al. Principled Hybrids of Generative and Discriminative Models , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[31] Geoffrey E. Hinton,et al. Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.
[32] Honglak Lee,et al. Sparse deep belief net model for visual area V2 , 2007, NIPS.
[33] Marc'Aurelio Ranzato,et al. Sparse Feature Learning for Deep Belief Networks , 2007, NIPS.
[34] Geoffrey E. Hinton,et al. To recognize shapes, first learn to generate images. , 2007, Progress in brain research.
[35] Jason Weston,et al. Large-scale kernel machines , 2007 .
[36] John F. Kalaska,et al. Computational neuroscience : theoretical insights into brain function , 2007 .
[37] Yoshua Bengio,et al. Scaling learning algorithms towards AI , 2007 .
[38] Geoffrey E. Hinton,et al. Modeling image patches with a directed hierarchy of Markov random fields , 2007, NIPS.
[39] Yoshua Bengio,et al. An empirical evaluation of deep architectures on problems with many factors of variation , 2007, ICML '07.
[40] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn.
[41] Geoffrey E. Hinton,et al. Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes , 2007, NIPS.
[42] Yann LeCun,et al. Deep belief net learning in a long-range vision system for autonomous off-road driving , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[43] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[44] Geoffrey E. Hinton,et al. Generating Facial Expressions with Deep Belief Nets , 2008 .
[45] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[46] Jason Weston,et al. Deep learning via semi-supervised embedding , 2008, ICML '08.
[47] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[48] Yoshua Bengio,et al. Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.
[49] Geoffrey E. Hinton. Reducing the Dimensionality of Data with Neural , 2008 .
[50] Yoshua Bengio,et al. Exploring Strategies for Training Deep Neural Networks , 2009, J. Mach. Learn. Res.
[51] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.
[52] Quoc V. Le,et al. Measuring Invariances in Deep Networks , 2009, NIPS.
[53] Pascal Vincent,et al. Visualizing Higher-Layer Features of a Deep Network , 2009 .
[54] Long Zhu,et al. Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[55] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[56] Yoshua Bengio,et al. Justifying and Generalizing Contrastive Divergence , 2009, Neural Computation.
[57] Pascal Vincent,et al. The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training , 2009, AISTATS.
[58] Geoffrey E. Hinton,et al. Semantic hashing , 2009, Int. J. Approx. Reason.
[59] Hossein Mobahi,et al. Deep learning from temporal coherence in video , 2009, ICML '09.