Gaussian-spherical restricted Boltzmann machines

We consider a special type of restricted Boltzmann machine (RBM), the Gaussian-spherical RBM, in which the visible units have Gaussian priors while the vector of hidden variables is constrained to lie on an $L_2$ sphere. Because the spherical constraint admits exact asymptotic treatments, various scaling regimes can be identified explicitly from the spectral properties of the coupling matrix (also called the weight matrix of the RBM) alone. Incidentally, these regimes are formally related to similar scaling behaviours obtained in a different context, the spatial condensation of zero-range processes. More specifically, when the spectrum of the coupling matrix is doubly degenerate, an exact treatment of finite-size effects can be given. Interestingly, the known parallel between the ferromagnetic transition of the spherical model and Bose-Einstein condensation can be made explicit in that case. More importantly, this lets us extract all the response functions needed by the training algorithm of the RBM with arbitrary precision, and hence numerically integrate the dynamics of the spectrum of the weight matrix during learning in a precise way. These dynamics reveal in particular a sequential emergence of modes from the Marchenko-Pastur bulk of singular vectors of the coupling matrix.
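As a rough illustration of what "emerging from the Marchenko-Pastur bulk" means, the sketch below builds a random Gaussian matrix, whose singular values fill the Marchenko-Pastur bulk, and adds a single planted rank-one mode. This is not the learning dynamics studied here: the matrix sizes, the planted strength, and the function names are assumptions made for the example, and the rank-one perturbation is only a stand-in for a mode learned from data.

```python
import numpy as np


def mp_singular_edges(n_rows, n_cols):
    """Asymptotic edges of the singular-value bulk of an i.i.d. Gaussian
    matrix whose entries have variance 1 / max(n_rows, n_cols)."""
    gamma = min(n_rows, n_cols) / max(n_rows, n_cols)
    return 1.0 - np.sqrt(gamma), 1.0 + np.sqrt(gamma)


def singular_values_with_planted_mode(n_rows, n_cols, strength, seed=0):
    """Singular values (descending) of noise + strength * u v^T, where u and v
    are random unit vectors. The planted mode is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_rows, n_cols)) / np.sqrt(max(n_rows, n_cols))
    u = rng.standard_normal(n_rows)
    u /= np.linalg.norm(u)
    v = rng.standard_normal(n_cols)
    v /= np.linalg.norm(v)
    weights = noise + strength * np.outer(u, v)
    return np.linalg.svd(weights, compute_uv=False)


if __name__ == "__main__":
    lower, upper = mp_singular_edges(1000, 250)
    for strength in (0.0, 2.0):
        s = singular_values_with_planted_mode(1000, 250, strength)
        print(f"strength={strength}: top two singular values {s[0]:.3f}, {s[1]:.3f}; "
              f"bulk edges [{lower:.3f}, {upper:.3f}]")
```

With no planted mode, all singular values should stay close to the bulk edges; with a sufficiently strong planted mode, one singular value separates from the bulk while the others remain inside it.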
