A tutorial on stochastic approximation algorithms for training Restricted Boltzmann Machines and Deep Belief Nets
Nando de Freitas | Bo Chen | Benjamin M. Marlin | Kevin Swersky
[1] Bruno A. Olshausen, et al. Learning Horizontal Connections in a Sparse Coding Model of Natural Images, 2007, NIPS.
[2] Geoffrey E. Hinton, et al. Semantic hashing, 2009, Int. J. Approx. Reason.
[3] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint, 2008.
[4] Yoshua Bengio, et al. Exploring Strategies for Training Deep Neural Networks, 2009, J. Mach. Learn. Res.
[5] H. Kushner, et al. Asymptotic Properties of Stochastic Approximations with Constant Coefficients, 1981.
[6] H. Kushner, et al. Averaging Methods for the Asymptotic Analysis of Learning and Adaptive Systems, with Small Adjustment Rate. Analysis of Nonlinear Stochastic Systems with Wide-Band Inputs, 1980.
[7] Honglak Lee, et al. Sparse deep belief net model for visual area V2, 2007, NIPS.
[8] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[9] Michael I. Jordan, et al. Learning with Mixtures of Trees, 2001, J. Mach. Learn. Res.
[10] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.
[11] Alan L. Yuille, et al. The Convergence of Contrastive Divergences, 2004, NIPS.
[12] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[13] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[14] Yoshua Bengio, et al. Scaling learning algorithms towards AI, 2007.
[15] Eugenius Kaszkurewicz, et al. Steepest descent with momentum for quadratic functions is a version of the conjugate gradient method, 2004, Neural Networks.
[16] J. Spall. Adaptive stochastic approximation by the simultaneous perturbation method, 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[17] Max Welling, et al. Herding Dynamic Weights for Partially Observed Random Field Models, 2009, UAI.
[18] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence, 2002, Neural Computation.
[19] B. Delyon. General results on the convergence of stochastic algorithms, 1996, IEEE Trans. Autom. Control.
[20] Tijmen Tieleman, et al. Training restricted Boltzmann machines using approximations to the likelihood gradient, 2008, ICML '08.
[21] Miguel Á. Carreira-Perpiñán, et al. On Contrastive Divergence Learning, 2005, AISTATS.
[22] Quoc V. Le, et al. Measuring Invariances in Deep Networks, 2009, NIPS.
[23] William A. Sethares, et al. Analysis of momentum adaptive filtering algorithms, 1998, IEEE Trans. Signal Process.
[24] Harold J. Kushner, et al. Stochastic Approximation Algorithms and Applications, 1997, Applications of Mathematics.
[25] Terrence J. Sejnowski, et al. Slow Feature Analysis: Unsupervised Learning of Invariances, 2002, Neural Computation.
[26] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[27] Thomas Hofmann, et al. Greedy Layer-Wise Training of Deep Networks, 2007.
[28] L. Younes. Parametric Inference for imperfectly observed Gibbsian fields, 1989.
[29] Yoshua Bengio, et al. Classification using discriminative restricted Boltzmann machines, 2008, ICML '08.
[30] Geoffrey E. Hinton, et al. Exponential Family Harmoniums with an Application to Information Retrieval, 2004, NIPS.
[31] Christophe Andrieu, et al. A tutorial on adaptive MCMC, 2008, Stat. Comput.
[32] Alexander V. Nazin, et al. Generalization Error Bounds for Aggregation by Mirror Descent with Averaging, 2005, NIPS.
[33] D. Ruppert. A Newton-Raphson Version of the Multivariate Robbins-Monro Procedure, 1985.
[34] Honglak Lee, et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, 2009, ICML '09.
[35] Pascal Vincent, et al. The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training, 2009, AISTATS.
[36] P. Kumar, et al. Theory and practice of recursive identification, 1985, IEEE Transactions on Automatic Control.
[37] D. George, et al. Hierarchical Temporal Memory Concepts, Theory, and Terminology, 2006.
[38] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.
[39] Geoffrey E. Hinton, et al. Restricted Boltzmann machines for collaborative filtering, 2007, ICML '07.
[40] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.
[41] Pierre Priouret, et al. Adaptive Algorithms and Stochastic Approximations, 1990, Applications of Mathematics.