Average Contrastive Divergence for Training Restricted Boltzmann Machines

This paper studies the contrastive divergence (CD) learning algorithm and proposes a new algorithm for training restricted Boltzmann machines (RBMs). We show that CD is a biased estimator of the log-likelihood gradient and analyze this bias. We then propose a new learning algorithm for training RBMs, called average contrastive divergence (ACD), which modifies the traditional CD algorithm. Experimental results show that ACD approximates the log-likelihood gradient more closely than traditional CD and outperforms it.
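For readers unfamiliar with the baseline, the following is a minimal sketch of the standard CD-k estimate of the weight gradient for a binary RBM, written with the Python standard library only. It is not the ACD algorithm, whose details the abstract does not give. The function names, the list-of-lists weight layout `W[i][j]`, and the bias names `b` (visible) and `c` (hidden) are assumptions made for illustration. Replacing the model expectation with a statistic from a k-step Gibbs chain started at the data is what makes the estimate biased for finite k.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    # p(h_j = 1 | v) for a binary RBM; W[i][j] couples visible i to hidden j.
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    # p(v_i = 1 | h) for the same RBM; b holds the visible biases.
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def sample(probs, rng):
    # Draw independent Bernoulli samples with the given success probabilities.
    return [1 if rng.random() < p else 0 for p in probs]


def cd_k_weight_gradient(v0, W, b, c, k=1, rng=None):
    """CD-k estimate of d log p(v0) / dW for one training vector v0.

    Positive phase: v0[i] * p(h_j = 1 | v0).
    Negative phase: the same statistic after k Gibbs steps started at v0,
    used in place of the intractable expectation under the model.
    """
    rng = rng or random.Random(0)  # hypothetical default seed for repeatability
    ph0 = hidden_probs(v0, W, c)
    vk, phk = list(v0), ph0
    for _ in range(k):
        h = sample(phk, rng)                       # h ~ p(h | v)
        vk = sample(visible_probs(h, W, b), rng)   # v ~ p(v | h)
        phk = hidden_probs(vk, W, c)
    return [[v0[i] * ph0[j] - vk[i] * phk[j] for j in range(len(c))]
            for i in range(len(b))]
```

A weight update would add a learning rate times this matrix to `W`. Larger `k` runs the chain longer, which brings the negative phase closer to the model distribution at higher cost.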
