Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures

This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors belonging to the class of so-called \textit{concentrated} random vectors. Gram matrices of the form $G = X^T X$, with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and the $x_i$ independent concentrated random vectors drawn from a mixture model, behave asymptotically (as $n,p\to \infty$) as if the $x_i$ were drawn from a Gaussian mixture. Combining these two facts suggests that, for a wide range of standard classifiers, DL representations of GAN-data can be fully described by their first two statistical moments. Our theoretical findings are validated on images generated with the BigGAN model, across several popular deep representation networks.
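
The Gram-matrix claim above can be illustrated numerically with a small sketch. It is not the paper's experiment: the tanh map below is a hypothetical stand-in for "GAN generator followed by a representation network" (a Lipschitz transform of a standard Gaussian vector, which yields concentrated vectors), and the names `representation`, `gaussian_with_same_moments`, `gram_spectrum`, and `compare_spectra`, the $1/p$ normalization of $G$, and all dimensions are assumptions chosen for illustration. The sketch builds a two-class mixture of such vectors, draws a Gaussian mixture with matching per-class means and covariances, and returns the eigenvalues of both Gram matrices so they can be compared.

```python
import numpy as np

def representation(z, W, b):
    # Hypothetical stand-in for generator + feature extractor:
    # a Lipschitz map (affine, then tanh) applied to Gaussian noise z.
    return np.tanh(W @ z + b[:, None])

def gaussian_with_same_moments(X_ref, n, rng):
    # Estimate the mean and covariance from reference samples (columns of X_ref),
    # then draw n Gaussian vectors with those first two moments.
    mean = X_ref.mean(axis=1)
    cov = np.cov(X_ref)                      # rows are variables -> p x p
    vals, vecs = np.linalg.eigh(cov)
    root = vecs * np.sqrt(np.clip(vals, 0.0, None))  # root @ root.T == cov
    return mean[:, None] + root @ rng.standard_normal((X_ref.shape[0], n))

def gram_spectrum(X):
    # Eigenvalues (ascending) of G = X^T X / p; the 1/p scaling is an assumption.
    p = X.shape[0]
    return np.linalg.eigvalsh(X.T @ X / p)

def compare_spectra(p=128, d=64, n_per_class=200, n_ref=5000, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((p, d)) / np.sqrt(d)
    concentrated, gaussian = [], []
    for _ in range(2):                       # two mixture classes
        b = 0.5 * rng.standard_normal(p)     # class-specific shift
        X_ref = representation(rng.standard_normal((d, n_ref)), W, b)
        concentrated.append(
            representation(rng.standard_normal((d, n_per_class)), W, b))
        gaussian.append(gaussian_with_same_moments(X_ref, n_per_class, rng))
    return (gram_spectrum(np.hstack(concentrated)),
            gram_spectrum(np.hstack(gaussian)))

if __name__ == "__main__":
    s_conc, s_gauss = compare_spectra()
    print("largest eigenvalues (concentrated):", s_conc[-3:])
    print("largest eigenvalues (Gaussian):    ", s_gauss[-3:])
```

Plotting histograms of the two returned spectra side by side is one way to inspect how closely they agree at finite $n$ and $p$; the result in the abstract is asymptotic, so small discrepancies at these sizes are expected.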
