Symmetry-via-Duality: Invariant Neural Network Densities from Parameter-Space Correlators

Parameter-space and function-space provide two different duality frames in which to study neural networks. We demonstrate that symmetries of network densities may be determined via dual computations of network correlation functions, even when the density is unknown and the network is not equivariant. Symmetry-via-duality relies on invariance properties of the correlation functions, which stem from the choice of network parameter distributions. We determine input and output symmetries of neural network densities, recovering known Gaussian process results in the infinite-width limit. The mechanism may also be used to determine symmetries during training, when parameters are correlated, as well as symmetries of the Neural Tangent Kernel. We demonstrate that the amount of symmetry in the initialization density affects the accuracy of networks trained on Fashion-MNIST, and that symmetry breaking helps only when it is in the direction of ground truth.
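
The mechanism can be illustrated with a small Monte Carlo check: when the first-layer weights are drawn i.i.d. Gaussian, the two-point correlator G(x1, x2) = E_θ[f_θ(x1) f_θ(x2)] depends only on rotation-invariant combinations of the inputs, so it is unchanged when both inputs are transformed by the same orthogonal matrix. The sketch below is a minimal illustration of this correlator-based symmetry test, not the paper's code; the single-hidden-layer tanh architecture, the width, and the sample count are assumptions made for the example.

```python
# Minimal sketch (illustrative assumptions, not the paper's setup): estimate the
# two-point correlator G(x1, x2) = E_theta[f_theta(x1) f_theta(x2)] of a width-N
# single-hidden-layer tanh network with i.i.d. Gaussian parameters, then check
# invariance under a random orthogonal transformation of the inputs.
import numpy as np

rng = np.random.default_rng(0)
d, N = 4, 64  # input dimension and hidden width (illustrative choices)

def two_point(x1, x2, n_samples=20000):
    """Monte Carlo estimate of E[f(x1) f(x2)] over i.i.d. Gaussian parameter draws."""
    W = rng.normal(0.0, 1.0 / np.sqrt(d), size=(n_samples, N, d))  # input weights
    v = rng.normal(0.0, 1.0 / np.sqrt(N), size=(n_samples, N))     # readout weights
    f1 = np.einsum("sn,sn->s", v, np.tanh(W @ x1))                 # f_theta(x1) per draw
    f2 = np.einsum("sn,sn->s", v, np.tanh(W @ x2))                 # f_theta(x2) per draw
    return np.mean(f1 * f2)

# Random orthogonal R via QR of a Gaussian matrix. With i.i.d. Gaussian first-layer
# weights, the correlator depends only on O(d)-invariant combinations of the inputs,
# so G(Rx1, Rx2) should agree with G(x1, x2) up to Monte Carlo noise.
R, _ = np.linalg.qr(rng.normal(size=(d, d)))

x1, x2 = rng.normal(size=d), rng.normal(size=d)
print(f"G(x1, x2)   = {two_point(x1, x2):.4f}")
print(f"G(Rx1, Rx2) = {two_point(R @ x1, R @ x2):.4f}  # agreement signals input symmetry")
```

If a different parameter distribution were used and the two estimates disagreed beyond sampling error, that would indicate the corresponding symmetry is broken in the network density, in line with the symmetry-breaking experiments described above.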
