Convergence of contrastive divergence algorithm in exponential family
Bai Jiang | Tung-Yu Wu | Yifan Jin | Wing H. Wong
[1] Marc Teboulle, et al. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, 2009, SIAM J. Imaging Sci.
[2] Geoffrey E. Hinton, et al. A Learning Algorithm for Boltzmann Machines, 1985, Cogn. Sci.
[3] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence, 2002, Neural Computation.
[4] Garry Robins, et al. An introduction to exponential random graph (p*) models for social networks, 2007, Soc. Networks.
[5] J. Rosenthal, et al. Geometric Ergodicity and Hybrid Markov Chains, 1997.
[6] Yoshua Bengio, et al. Classification using discriminative restricted Boltzmann machines, 2008, ICML '08.
[7] Gersende Fort, et al. On Perturbed Proximal Gradient Algorithms, 2014, J. Mach. Learn. Res.
[8] R. Durrett. Probability: Theory and Examples, 1993.
[9] S. Meyn, et al. Stability of Markovian processes I: criteria for discrete-time chains, 1992, Advances in Applied Probability.
[10] Geoffrey E. Hinton, et al. Acoustic Modeling Using Deep Belief Networks, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Persi Diaconis, et al. The Markov chain Monte Carlo revolution, 2008.
[12] Pavel N. Krivitsky, et al. Using contrastive divergence to seed Monte Carlo MLE for exponential-family random graph models, 2017, Comput. Stat. Data Anal.
[13] Thomas Hofmann, et al. Greedy Layer-Wise Training of Deep Networks, 2007.
[14] Jonathan C. Mattingly, et al. Yet Another Look at Harris' Ergodic Theorem for Markov Chains, 2008, arXiv:0810.2777.
[15] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.
[16] D. MacKay, et al. Failures of the One-Step Learning Algorithm, 2001.
[17] R. Tweedie, et al. Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms, 1996.
[18] Miguel Á. Carreira-Perpiñán, et al. On Contrastive Divergence Learning, 2005, AISTATS.
[19] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[20] Martina Morris, et al. ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks, 2008, Journal of Statistical Software.
[21] R. Tweedie. Criteria for classifying general Markov chains, 1976, Advances in Applied Probability.
[22] E. L. Lehmann, et al. Theory of Point Estimation, 1950.
[23] J. Rosenthal, et al. General state space Markov chains and MCMC algorithms, 2004, arXiv:math/0404033.
[24] Kazuoki Azuma. Weighted sums of certain dependent random variables, 1967.
[25] Stephen P. Boyd, et al. Proximal Algorithms, 2013, Found. Trends Optim.
[26] Christopher K. I. Williams, et al. An analysis of contrastive divergence learning in Gaussian Boltzmann machines, 2002.
[27] S. Meyn, et al. Geometric ergodicity and the spectral gap of non-reversible Markov chains, 2009, arXiv:0906.5322.
[28] Miguel Á. Carreira-Perpiñán, et al. Multiscale conditional random fields for image labeling, 2004, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004).
[29] R. Tweedie, et al. Rates of convergence of the Hastings and Metropolis algorithms, 1996.
[30] Michael J. Black, et al. Fields of Experts: a framework for learning image priors, 2005, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[31] Geoffrey E. Hinton, et al. Restricted Boltzmann machines for collaborative filtering, 2007, ICML '07.
[32] Alan L. Yuille, et al. The Convergence of Contrastive Divergences, 2004, NIPS.
[33] Yee Whye Teh, et al. Energy-Based Models for Sparse Overcomplete Representations, 2003, J. Mach. Learn. Res.
[34] F. G. Foster. On the Stochastic Matrices Associated with Certain Queuing Processes, 1953.
[35] Padhraic Smyth, et al. Learning with Blocks: Composite Likelihood and Contrastive Divergence, 2010, AISTATS.
[36] Yoshua Bengio, et al. Justifying and Generalizing Contrastive Divergence, 2009, Neural Computation.
[37] Geoffrey E. Hinton, et al. Replicated Softmax: an Undirected Topic Model, 2009, NIPS.
[38] S. Meyn, et al. Stability of Markovian processes III: Foster–Lyapunov criteria for continuous-time processes, 1993, Advances in Applied Probability.
[39] D. Rudolf, et al. Explicit error bounds for Markov chain Monte Carlo, 2011, arXiv:1108.3201.
[40] Ilya Sutskever, et al. On the Convergence Properties of Contrastive Divergence, 2010, AISTATS.
[41] Y. Amit. Convergence properties of the Gibbs sampler for perturbations of Gaussians, 1996.
[42] Honglak Lee, et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, 2011, AISTATS.
[43] Aapo Hyvärinen, et al. Consistency of Pseudolikelihood Estimation of Fully Visible Boltzmann Machines, 2006, Neural Computation.
[44] Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.