Exploring Strategies for Training Deep Neural Networks
Yoshua Bengio | Hugo Larochelle | Jérôme Louradour | Pascal Lamblin
[1] A. Yao. Separating the polynomial-time hierarchy by oracles , 1985 .
[2] Paul Smolensky,et al. Information processing in dynamical systems: foundations of harmony theory , 1986 .
[3] Johan Håstad,et al. Almost optimal lower bounds for small depth circuits , 1986, STOC '86.
[4] Ingo Wegener,et al. The complexity of Boolean functions , 1987 .
[5] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[6] Christian Lebiere,et al. The Cascade-Correlation Learning Architecture , 1989, NIPS.
[7] Geoffrey E. Hinton. Connectionist Learning Procedures , 1989, Artif. Intell..
[8] Eric Saund,et al. Dimensionality-Reduction Using Connectionist Networks , 1989, IEEE Trans. Pattern Anal. Mach. Intell..
[9] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[10] Christian Jutten,et al. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture , 1991, Signal Process..
[11] Radford M. Neal. Connectionist Learning of Belief Networks , 1992, Artif. Intell..
[12] Garrison W. Cottrell,et al. Non-Linear Dimensionality Reduction , 1992, NIPS.
[13] J. Friedman,et al. A Statistical View of Some Chemometrics Regression Tools , 1993 .
[14] J. Friedman,et al. [A Statistical View of Some Chemometrics Regression Tools]: Response , 1993 .
[15] Pierre Comon,et al. Independent component analysis, A new concept? , 1994, Signal Process..
[16] Terrence J. Sejnowski,et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.
[17] Peter Auer,et al. Exponentially many local minima for single neurons , 1995, NIPS.
[18] Geoffrey E. Hinton,et al. The Helmholtz Machine , 1995, Neural Computation.
[19] Geoffrey E. Hinton,et al. The "wake-sleep" algorithm for unsupervised neural networks , 1995, Science.
[20] Michael I. Jordan,et al. Mean Field Theory for Sigmoid Belief Networks , 1996, J. Artif. Intell. Res..
[21] Régis Lengellé,et al. Training MLPs layer by layer using an objective function for internal representations , 1996, Neural Networks.
[22] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[23] Brendan J. Frey,et al. Graphical Models for Machine Learning and Digital Communication , 1998 .
[24] David Haussler,et al. Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.
[25] Kenji Fukumizu,et al. Local minima and plateaus in hierarchical structures of multilayer perceptrons , 2000, Neural Networks.
[26] Michael I. Jordan,et al. On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes , 2001, NIPS.
[27] Paul Mineiro,et al. A Monte Carlo EM Approach for Partially Observable Diffusion Processes: Theory and Applications to Neural Networks , 2002, Neural Computation.
[28] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[29] Geoffrey E. Hinton,et al. A New Learning Algorithm for Mean Field Boltzmann Machines , 2002, ICANN.
[30] Alan F. Murray,et al. Continuous restricted Boltzmann machine with an implementable training algorithm , 2003 .
[31] Tony Jebara,et al. Machine Learning: Discriminative and Generative (Kluwer International Series in Engineering and Computer Science) , 2003 .
[32] Geoffrey E. Hinton,et al. Exponential Family Harmoniums with an Application to Information Retrieval , 2004, NIPS.
[33] Guillaume Bouchard,et al. The Tradeoff Between Generative and Discriminative Classifiers , 2004 .
[34] Pietro Perona,et al. A discriminative framework for modelling object classes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[35] Johan Håstad,et al. On the power of small-depth threshold circuits , 1991, computational complexity.
[36] Tong Zhang,et al. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..
[37] Nicolas Le Roux,et al. The Curse of Highly Variable Functions for Local Kernel Machines , 2005, NIPS.
[38] Miguel Á. Carreira-Perpiñán,et al. On Contrastive Divergence Learning , 2005, AISTATS.
[39] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[40] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[41] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[42] Marc'Aurelio Ranzato,et al. Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.
[43] Tom Minka,et al. Principled Hybrids of Generative and Discriminative Models , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[44] Geoffrey E. Hinton,et al. Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.
[45] Marc'Aurelio Ranzato,et al. Sparse Feature Learning for Deep Belief Networks , 2007, NIPS.
[46] Geoffrey E. Hinton,et al. To recognize shapes, first learn to generate images , 2007, Progress in brain research.
[47] Jason Weston,et al. Large-scale kernel machines , 2007 .
[48] Yoshua Bengio,et al. Scaling learning algorithms towards AI , 2007 .
[49] Geoffrey E. Hinton,et al. Modeling image patches with a directed hierarchy of Markov random fields , 2007, NIPS.
[50] Geoffrey E. Hinton,et al. Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure , 2007, AISTATS.
[51] Yoshua Bengio,et al. An empirical evaluation of deep architectures on problems with many factors of variation , 2007, ICML '07.
[52] Marc'Aurelio Ranzato,et al. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[53] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[54] Geoffrey E. Hinton,et al. Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes , 2007, NIPS.
[55] Yann LeCun,et al. Deep belief net learning in a long-range vision system for autonomous off-road driving , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[56] Ruslan Salakhutdinov,et al. On the quantitative analysis of deep belief networks , 2008, ICML '08.
[57] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[58] Jason Weston,et al. Deep learning via semi-supervised embedding , 2008, ICML '08.
[59] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[60] Yoshua Bengio,et al. Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.
[61] Yoshua Bengio,et al. Justifying and Generalizing Contrastive Divergence , 2009, Neural Computation.
[62] Geoffrey E. Hinton,et al. Semantic hashing , 2009, Int. J. Approx. Reason..