Restricted Boltzmann machine: Recent advances and mean-field theory

This review examines the restricted Boltzmann machine (RBM) in the light of statistical physics. The RBM is a classical family of machine learning (ML) models that played a central role in the development of deep learning. Viewing it as a spin glass model and exhibiting various links with other models of statistical physics, we gather recent results dealing with mean-field theory in this context. First, the functioning of the RBM can be analyzed via the phase diagrams obtained for various statistical ensembles of RBMs, which leads in particular to the identification of a compositional phase where a small number of features or modes are combined to form complex patterns. Then we discuss recent works that either devise mean-field based learning algorithms or reproduce generic aspects of the learning process from ensemble dynamics equations and/or linear stability arguments.
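
For orientation, the spin glass analogy can be made concrete with the standard textbook form of a binary RBM. The notation below (visible units $v_i$, hidden units $h_j$, couplings $W_{ij}$, biases $a_i$ and $b_j$) is assumed here for illustration and is not taken from the abstract; the review itself may use other conventions, such as $\pm 1$ spins or non-binary priors.

```latex
% Energy of a joint configuration (v, h): a bipartite, disordered Ising-type model
E(\mathbf{v}, \mathbf{h}) = -\sum_{i} a_i v_i - \sum_{j} b_j h_j - \sum_{i,j} v_i W_{ij} h_j

% Boltzmann-Gibbs distribution with partition function Z
P(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{Z},
\qquad
Z = \sum_{\mathbf{v}, \mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}
```

In this form the weights $W_{ij}$ act as the couplings between two layers of spins with no intra-layer interactions, which is what makes mean-field treatments of the kind surveyed in the review applicable.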
