The Inductive Bias of Restricted f-GANs

Generative adversarial networks are a novel method for statistical inference that has achieved much empirical success; however, the factors contributing to this success remain poorly understood. In this work, we attempt to analyze generative adversarial learning -- that is, statistical inference as the result of a game between a generator and a discriminator -- with a view to understanding how it differs from classical statistical inference solutions such as maximum likelihood inference and the method of moments. Specifically, we provide a theoretical characterization of the distribution inferred by a simple form of generative adversarial learning called restricted f-GANs, in which the discriminator is a function in a given function class, the distribution induced by the generator is restricted to lie in a pre-specified distribution class, and the objective is similar to a variational form of the f-divergence. A consequence of our result is that for linear KL-GANs -- that is, when the discriminator is a linear function over some feature space and f corresponds to the KL-divergence -- the distribution induced by the optimal generator is neither the maximum likelihood nor the method of moments solution, but an interesting combination of both.
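
For readers unfamiliar with the variational form of an f-divergence mentioned above, the following is a minimal sketch for the KL case on a small discrete example. It uses the standard bound KL(P || Q) >= E_P[h] - E_Q[f*(h)] with f(t) = t log t and conjugate f*(s) = exp(s - 1). This is an assumption about the general shape of the objective, not the exact objective analyzed in the paper, which is only described as "similar to" this form. The function names and the distributions `p` and `q` are hypothetical and chosen purely for illustration.

```python
import math


def kl_divergence(p, q):
    """Exact KL(P || Q) for discrete distributions with full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_variational_bound(p, q, h):
    """Variational lower bound E_P[h] - E_Q[f*(h)].

    Uses f*(s) = exp(s - 1), the convex conjugate of f(t) = t * log(t).
    `h` plays the role of the discriminator, given as one value per outcome.
    """
    e_p = sum(pi * hi for pi, hi in zip(p, h))
    e_q = sum(qi * math.exp(hi - 1.0) for qi, hi in zip(q, h))
    return e_p - e_q


# Illustrative (hypothetical) data and generator distributions.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

# Unrestricted optimal discriminator for the KL case: h = 1 + log(p / q).
h_opt = [1.0 + math.log(pi / qi) for pi, qi in zip(p, q)]

# A heavily restricted discriminator class: constant functions only.
h_const = [1.0, 1.0, 1.0]

exact = kl_divergence(p, q)
bound_opt = kl_variational_bound(p, q, h_opt)
bound_const = kl_variational_bound(p, q, h_const)

print(f"exact KL:             {exact:.6f}")
print(f"bound, optimal h:     {bound_opt:.6f}")
print(f"bound, constant h:    {bound_const:.6f}")
```

With the unrestricted optimal discriminator the bound matches the exact KL divergence, while the constant discriminator yields a strictly looser value. This illustrates why restricting the discriminator class changes which generator distribution is preferred, which is the effect the abstract sets out to characterize.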
