Exploring the common principal subspace of deep features in neural networks

Abstract We find that different Deep Neural Networks (DNNs) trained with the same dataset share a common principal subspace in their latent spaces, regardless of the architectures in which the DNNs were built (e.g., Convolutional Neural Networks (CNNs), Multi-Layer Perceptrons (MLPs), and Autoencoders (AEs)) and even of whether labels were used in training (e.g., supervised, unsupervised, and self-supervised learning). Specifically, we design a new metric, the P-vector, to represent the principal subspace of deep features learned in a DNN, and propose to measure the angles between principal subspaces using P-vectors. Small angles (with cosine close to 1.0) have been found in comparisons between any two DNNs trained with different algorithms/architectures. Furthermore, during training from random initialization, the angle decreases from a large value (typically 70°–80°) to a small one, which coincides with the progress of feature-space learning from scratch to convergence. We then carry out case studies measuring the angle between the P-vector and the principal subspace of the training dataset, and connect this angle to generalization performance. Extensive experiments with practically-used MLPs, AEs, and CNNs for classification, image reconstruction, and self-supervised learning tasks on the MNIST, CIFAR-10, and CIFAR-100 datasets support these findings.
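The P-vector and the angle metric described above can be sketched as follows. This is a minimal illustration, assuming the P-vector is the top principal direction of a (centered) matrix of deep features, computed via SVD; the exact extraction procedure in the paper may differ, and `p_vector`/`subspace_angle_cos` are hypothetical helper names.

```python
import numpy as np

def p_vector(features):
    """Assumed P-vector: the first principal direction of a feature matrix.

    features: (n_samples, d) array of deep features from one model,
    e.g., activations of the last hidden layer over the dataset.
    """
    centered = features - features.mean(axis=0)          # center features
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]                                          # unit-norm top direction

def subspace_angle_cos(v1, v2):
    """Cosine of the angle between two P-vectors.

    Principal directions have an arbitrary sign, so the absolute value
    is taken; a result close to 1.0 means nearly aligned subspaces.
    """
    return abs(np.dot(v1, v2)) / (np.linalg.norm(v1) * np.linalg.norm(v2))
```

In use, one would extract features for the same dataset from two trained models (with matching feature dimension, or after a common projection), compute a P-vector for each, and compare them with `subspace_angle_cos`.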
