Group Equivariant Neural Architecture Search via Group Decomposition and Reinforcement Learning

Recent work shows that including group equivariance as an inductive bias improves neural network performance on both classification and generation tasks. Designing group-equivariant neural networks is challenging, however, when the group of interest is large and unknown. Moreover, inducing equivariance can significantly reduce the number of independent parameters in a network of fixed feature size, affecting its overall performance. We address these problems by proving a new group-theoretic result, in the context of equivariant neural networks, showing that a network is equivariant to a large group if and only if it is equivariant to the smaller groups from which the large group is constructed. We also design an algorithm for constructing equivariant networks with significantly improved computational complexity. Further, leveraging our theoretical result, we use deep Q-learning to search for group-equivariant networks that maximize performance over a search space significantly smaller than that of naive approaches, yielding what we call autoequivariant networks (AENs). To evaluate AENs, we construct and release new benchmark datasets, G-MNIST and G-Fashion-MNIST, obtained by applying group transformations to MNIST and Fashion-MNIST, respectively. We show that AENs strike the right balance between group equivariance and number of parameters, and thus consistently achieve strong task performance.
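To make the decomposition result concrete, here is a small numerical illustration (a sketch, not the paper's construction): if a map commutes with the generators of two small groups, it commutes with every composite element of the group they generate, so equivariance only needs to be verified on the constituent groups. Average pooling serves as the equivariant map, and 90-degree rotations and horizontal flips generate the larger group.

```python
# A minimal numerical check of the decomposition idea (illustrative only).
# f is average pooling, which is equivariant to 90-degree rotations and
# horizontal flips of an even-sized image grid.
import torch
import torch.nn.functional as F

def rot(x):   # generator of the Z4 rotation subgroup
    return torch.rot90(x, 1, dims=(2, 3))

def flip(x):  # generator of the Z2 flip subgroup
    return torch.flip(x, dims=(3,))

f = lambda x: F.avg_pool2d(x, 2)   # an equivariant map for this group action
x = torch.randn(1, 1, 8, 8)

# Equivariance holds for each generator...
assert torch.allclose(f(rot(x)), rot(f(x)), atol=1e-6)
assert torch.allclose(f(flip(x)), flip(f(x)), atol=1e-6)
# ...and therefore for arbitrary composites in the generated group.
g = lambda x: rot(flip(rot(x)))
assert torch.allclose(f(g(x)), g(f(x)), atol=1e-6)
```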

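The deep Q-learning search can be sketched as follows, with hypothetical group names, a toy state encoding, and a stubbed reward (the paper trains each candidate network and uses its validation accuracy as the reward; a random placeholder keeps this sketch runnable on its own). The agent picks, layer by layer, which group equivariance to impose, and receives a terminal reward for the completed architecture.

```python
# A minimal sketch of DQN-style search over per-layer equivariance choices.
# Group names, encoding, and hyperparameters are illustrative assumptions.
import random
import torch
import torch.nn as nn

GROUPS = ["trivial", "Z4_rotations", "Z2_flips", "Z4xZ2"]  # hypothetical menu
NUM_LAYERS = 3
EPSILON, GAMMA, LR = 0.1, 0.9, 1e-3

def encode(arch):
    """One-hot encode a partial architecture, padded to NUM_LAYERS slots."""
    v = torch.zeros(NUM_LAYERS * len(GROUPS))
    for i, g in enumerate(arch):
        v[i * len(GROUPS) + GROUPS.index(g)] = 1.0
    return v

def evaluate_architecture(arch):
    # Placeholder reward: in the paper, train the candidate equivariant
    # network and return its validation accuracy. Randomized here so the
    # sketch runs standalone.
    return random.random()

q_net = nn.Sequential(nn.Linear(NUM_LAYERS * len(GROUPS), 64),
                      nn.ReLU(), nn.Linear(64, len(GROUPS)))
opt = torch.optim.Adam(q_net.parameters(), lr=LR)

for episode in range(100):
    arch, transitions = [], []
    for _ in range(NUM_LAYERS):                 # build one architecture
        s = encode(arch)
        if random.random() < EPSILON:           # epsilon-greedy exploration
            a = random.randrange(len(GROUPS))
        else:
            a = q_net(s).argmax().item()
        arch.append(GROUPS[a])
        transitions.append((s, a))
    reward = evaluate_architecture(arch)        # terminal reward only
    for t, (s, a) in enumerate(transitions):    # one-step Q updates
        if t == NUM_LAYERS - 1:
            target = torch.tensor(reward)       # terminal step: reward
        else:
            with torch.no_grad():               # bootstrap from next state
                target = GAMMA * q_net(encode(arch[:t + 1])).max()
        loss = (q_net(s)[a] - target) ** 2
        opt.zero_grad(); loss.backward(); opt.step()
```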
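Finally, a minimal sketch (not the authors' released pipeline) of how a G-MNIST-style benchmark can be built by acting on MNIST images with a finite group. The exact groups behind G-MNIST and G-Fashion-MNIST may differ; here the eight-element dihedral group generated by 90-degree rotations and horizontal flips is used for illustration.

```python
# Build group-transformed MNIST samples (illustrative; the released G-MNIST
# may use different groups). Requires torchvision >= 0.9 for tensor rotate.
import torch
from torchvision import datasets, transforms
from torchvision.transforms import functional as TF

def group_orbit(img):
    """Orbit of `img` under the dihedral group D4: horizontal flips composed
    with rotations by multiples of 90 degrees (8 elements)."""
    orbit = []
    for do_flip in (False, True):
        base = TF.hflip(img) if do_flip else img
        for k in range(4):
            orbit.append(TF.rotate(base, 90.0 * k))
    return orbit

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())
img, label = mnist[0]                               # img: (1, 28, 28) tensor
g_dataset = [(x, label) for x in group_orbit(img)]  # 8 labeled variants
```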