Shared Representational Geometry Across Neural Networks

Different neural networks trained on the same dataset often learn similar input-output mappings with very different weights. Is there some correspondence between these solutions? For linear networks, it has been shown that different instances of the same architecture encode the same representational similarity matrix, and that their neural activity patterns are related by orthogonal transformations. However, it is unclear whether this holds for non-linear networks. Using a shared response model, we show that different neural networks encode the same input examples as different orthogonal transformations of an underlying shared representation. We test this claim on both standard convolutional neural networks and residual networks trained on CIFAR-10 and CIFAR-100.
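As a concrete illustration of the shared-response-model idea described above, the sketch below fits per-network orthonormal maps W_i and a shared representation S by alternating orthogonal Procrustes updates. This is a minimal NumPy illustration under assumed conventions (function and variable names such as fit_srm, X, k are hypothetical), not the authors' implementation.

```python
# Minimal sketch of a deterministic shared response model (SRM) fit.
# X is a list of activation matrices, one per trained network,
# each of shape (n_units, n_examples) for the same input examples.
import numpy as np

def fit_srm(X, k, n_iter=20, seed=0):
    """Find orthonormal maps W[i] (n_units x k) and a shared response S
    (k x n_examples) minimizing sum_i ||X[i] - W[i] @ S||_F^2,
    subject to W[i].T @ W[i] = I."""
    rng = np.random.default_rng(seed)
    n_examples = X[0].shape[1]
    S = rng.standard_normal((k, n_examples))
    W = [None] * len(X)
    for _ in range(n_iter):
        # Update each W[i] by orthogonal Procrustes:
        # the closest orthonormal map to X[i] @ S.T via its SVD.
        for i, Xi in enumerate(X):
            U, _, Vt = np.linalg.svd(Xi @ S.T, full_matrices=False)
            W[i] = U @ Vt
        # Update the shared response as the mean back-projection.
        S = np.mean([Wi.T @ Xi for Wi, Xi in zip(W, X)], axis=0)
    return W, S

# Usage sketch: compare layer activations of two independently trained
# networks on the same inputs.
# X1, X2: (n_units, n_examples) activation matrices from two training runs.
# W, S = fit_srm([X1, X2], k=50)
```

If the learned W_i are close to orthogonal transformations of one another's subspaces, the reconstruction error stays low, which is the sense in which different networks share an underlying representation.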
