Deep generative modelling for amortised variational inference

Probabilistic and statistical modelling are the fundamental frameworks underlying a large proportion of modern machine learning (ML) techniques. They allow practitioners to develop tailor-made models that incorporate expert knowledge and learn from data. In the Bayesian framework, learning from data is referred to as inference. Model-specific inference methods are generally hard to derive, as they demand a high level of mathematical and statistical dexterity on the practitioner's part. As a result, a large community of researchers in ML and statistics works on automatic methods of inference (Carpenter et al., 2017; Tran et al., 2016; Kucukelbir et al., 2016; Ge et al., 2018; Salvatier et al., 2016; Uber, 2017; Lintusaari et al., 2018). These methods are generally model agnostic and are therefore called black-box inference. Recent work has shown that using deep learning techniques (Rezende and Mohamed, 2015b; Kingma et al., 2016; Srivastava and Sutton, 2017; Mescheder et al., 2017a) within the framework of variational inference (Jordan et al., 1999) not only allows for automatic and accurate inference but does so far more efficiently. The added efficiency comes from amortising the cost of learning: deep neural networks exploit the smoothness of the mapping between data points and their posterior parameters, so inference for a new data point reduces to a single forward pass through a trained network.

Deep-learning-based amortised variational inference is a relatively new field, and numerous challenges and issues must be tackled before it can be established as a standard method of inference. To this end, this thesis presents four pieces of original work in the domain of automatic amortised variational inference in statistical models. We first introduce two sets of techniques for amortising variational inference in Bayesian generative models such as Latent Dirichlet Allocation (Blei et al., 2003) and the Pachinko Allocation Machine (Li and McCallum, 2006). These techniques use deep neural networks and first-order stochastic-gradient optimisers for inference, and they apply generically to a large class of Bayesian generative models. We then introduce a novel variational framework for implicit generative models of data, called VEEGAN. This framework enables inference in statistical models where, unlike in Bayesian generative models, a prescribed likelihood function is not available; it uses a discriminator-based density ratio estimator (Sugiyama et al., 2012) to deal with the intractability of the likelihood. Implicit generative models such as generative adversarial networks (GANs; Goodfellow et al., 2014) suffer from learning issues such as mode collapse (Srivastava et al., 2017) and training instability (Arjovsky et al., 2017). We tackle mode collapse in GANs using VEEGAN.
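To make the amortisation idea concrete, the following is a minimal sketch in PyTorch of an amortised Gaussian inference network of the kind used in variational autoencoders (Kingma and Welling, 2013; Rezende et al., 2014). The layer sizes, the Bernoulli likelihood, and the `decoder` module are illustrative assumptions, not details taken from the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Amortised inference network: one shared network maps each data
    point x to the parameters of its approximate posterior q(z|x), so
    inference for a new point is a single forward pass rather than a
    per-point optimisation."""
    def __init__(self, x_dim=784, z_dim=20, h_dim=200):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)       # posterior mean
        self.log_var = nn.Linear(h_dim, z_dim)  # posterior log-variance

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.log_var(h)

def negative_elbo(x, encoder, decoder):
    """Single-sample Monte Carlo estimate of the negative evidence lower
    bound, assuming a Bernoulli likelihood and a N(0, I) prior on z."""
    mu, log_var = encoder(x)
    # Reparameterisation trick: z = mu + sigma * eps with eps ~ N(0, I),
    # so gradients flow through the sampling step.
    z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
    recon = F.binary_cross_entropy(decoder(z), x, reduction="sum")
    # KL(q(z|x) || N(0, I)) has a closed form for a diagonal Gaussian.
    kl = -0.5 * torch.sum(1.0 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl
```

Training minimises this bound jointly over encoder and decoder with a stochastic first-order optimiser such as Adam; the trained encoder then serves as the inference routine for every data point, which is where the amortisation of the learning cost comes from.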
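The density-ratio trick mentioned above can also be sketched briefly. A classifier trained to separate samples of p from samples of q approaches the Bayes-optimal discriminator D*(x) = p(x) / (p(x) + q(x)), so for a sigmoid classifier the raw logit recovers log p(x) - log q(x), which can stand in for an intractable likelihood term. The sketch below is a generic illustration of this estimator (Sugiyama et al., 2012), not VEEGAN itself; the function names are hypothetical.

```python
import torch
import torch.nn as nn

def train_ratio_estimator(disc, sample_p, sample_q, steps=1000, lr=1e-3):
    """Fit a classifier to distinguish samples of p (label 1) from
    samples of q (label 0). At optimum, sigmoid(disc(x)) approximates
    p(x) / (p(x) + q(x))."""
    opt = torch.optim.Adam(disc.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        xp, xq = sample_p(), sample_q()  # minibatches from the two densities
        logits_p, logits_q = disc(xp), disc(xq)
        logits = torch.cat([logits_p, logits_q])
        labels = torch.cat([torch.ones_like(logits_p),
                            torch.zeros_like(logits_q)])
        opt.zero_grad()
        loss_fn(logits, labels).backward()
        opt.step()
    return disc

def log_density_ratio(disc, x):
    # For the optimal sigmoid discriminator, the raw logit equals
    # log p(x) - log q(x), i.e. the log density ratio.
    return disc(x)
```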

[1] Radford M. Neal. Probabilistic Inference Using Markov Chain Monte Carlo Methods, 2011.

[2] Lucas Theis et al. Amortised MAP Inference for Image Super-resolution. ICLR, 2016.

[3] Wei Li et al. Nonparametric Bayes Pachinko Allocation. UAI, 2007.

[4] Geoffrey E. Hinton et al. Exponential Family Harmoniums with an Application to Information Retrieval. NIPS, 2004.

[5] Radford M. Neal. MCMC Using Hamiltonian Dynamics. arXiv:1206.1901, 2011.

[6] Geoffrey E. Hinton et al. The Helmholtz Machine. Neural Computation, 1995.

[7] Thore Graepel et al. Kernel Topic Models. AISTATS, 2011.

[8] David M. Mimno et al. Reconstructing Pompeian Households. UAI, 2011.

[9] Yoshua Bengio et al. Generative Adversarial Nets. NIPS, 2014.

[10] Aapo Hyvärinen et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. AISTATS, 2010.

[11] Mario Lucic et al. Are GANs Created Equal? A Large-Scale Study. NeurIPS, 2017.

[12] Sanjeev Arora et al. Do GANs Actually Learn the Distribution?, 2018.

[13] Sean Gerrish et al. Black Box Variational Inference. AISTATS, 2013.

[14] Pietro Perona et al. A Bayesian hierarchical model for learning natural scene categories. CVPR, 2005.

[15] Soumith Chintala et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. ICLR, 2015.

[16] David Duvenaud et al. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation. ICLR, 2017.

[17] P. Diggle et al. Monte Carlo Methods of Inference for Implicit Statistical Models, 1984.

[18] Karol Gregor et al. Neural Variational Inference and Learning in Belief Networks. ICML, 2014.

[19] Hugo Larochelle et al. A Neural Autoregressive Topic Model. NIPS, 2012.

[20] Charles Elkan et al. Expectation Maximization Algorithm. Encyclopedia of Machine Learning, 2010.

[21] David Pfau et al. Unrolled Generative Adversarial Networks. ICLR, 2016.

[22] Timothy Baldwin et al. Automatic Evaluation of Topic Coherence. NAACL, 2010.

[23] Navdeep Jaitly et al. Adversarial Autoencoders. arXiv, 2015.

[24] Brandon M. Turner et al. A tutorial on approximate Bayesian computation, 2012.

[25] Daan Wierstra et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models. ICML, 2014.

[26] David M. Blei et al. The Generalized Reparameterization Gradient. NIPS, 2016.

[27] Trevor Darrell et al. Adversarial Feature Learning. ICLR, 2016.

[28] Oriol Vinyals et al. Neural Discrete Representation Learning. NIPS, 2017.

[29] Yu Cheng et al. Sobolev GAN. ICLR, 2017.

[30] Joshua B. Tenenbaum et al. Human-level concept learning through probabilistic program induction. Science, 2015.

[31] Yoshua Bengio et al. Mode Regularized Generative Adversarial Networks. ICLR, 2016.

[32] Sepp Hochreiter et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. NIPS, 2017.

[33] Dustin Tran et al. Edward: A library for probabilistic modeling, inference, and criticism. arXiv, 2016.

[34] Francis R. Bach et al. Online Learning for Latent Dirichlet Allocation. NIPS, 2010.

[35] Guigang Zhang et al. Deep Learning. Int. J. Semantic Comput., 2016.

[36] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method. arXiv, 2012.

[37] Jimmy Ba et al. Adam: A Method for Stochastic Optimization. ICLR, 2014.

[38] Dustin Tran et al. Automatic Differentiation Variational Inference. J. Mach. Learn. Res., 2016.

[39] John Salvatier et al. Probabilistic programming in Python using PyMC3. PeerJ Comput. Sci., 2016.

[40] Colin Campbell et al. The latent process decomposition of cDNA microarray data sets. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2005.

[41] Shakir Mohamed et al. Variational Inference with Normalizing Flows. ICML, 2015.

[42] Yoshua Bengio et al. Boundary-Seeking Generative Adversarial Networks. ICLR, 2017.

[43] Michael I. Jordan et al. Latent Dirichlet Allocation. J. Mach. Learn. Res., 2001.

[44] Ole Winther et al. Autoencoding beyond pixels using a learned similarity metric. ICML, 2015.

[45] Jakob H. Macke et al. Flexible statistical inference for mechanistic models of neural dynamics. NIPS, 2017.

[46] Tim Salimans et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks. NIPS, 2016.

[47] Noah A. Rosenberg et al. AABC: Approximate approximate Bayesian computation for inference in population-genetic models. Theoretical Population Biology, 2015.

[48] Dustin Tran et al. Hierarchical Implicit Models and Likelihood-Free Variational Inference. NIPS, 2017.

[49] Jun-Yan Zhu et al. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. ICCV, 2017.

[50] Sergey Levine. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. arXiv, 2018.

[51] Jukka Corander et al. Likelihood-Free Inference by Ratio Estimation. Bayesian Analysis, 2016.

[52] Sebastian Nowozin et al. Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks. ICML, 2017.

[53] Max Welling et al. Auto-Encoding Variational Bayes. ICLR, 2013.

[54] Didrik Nielsen et al. Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam. ICML, 2018.

[55] Aki Vehtari et al. ELFI: Engine for Likelihood-Free Inference. J. Mach. Learn. Res., 2016.

[56] Sebastian Nowozin et al. The Numerics of GANs. NIPS, 2017.

[57] Wojciech Zaremba et al. Improved Techniques for Training GANs. NIPS, 2016.

[58] D. Knill et al. The Bayesian brain: the role of uncertainty in neural coding and computation. Trends in Neurosciences, 2004.

[59] Takafumi Kanamori et al. Density Ratio Estimation in Machine Learning, 2012.

[60] Aapo Hyvärinen et al. Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics. J. Mach. Learn. Res., 2012.

[61] Yoram Singer et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. J. Mach. Learn. Res., 2011.

[62] Arthur Gretton et al. Demystifying MMD GANs. ICLR, 2018.

[63] John D. Lafferty et al. A correlated topic model of Science. arXiv:0708.3601, 2007.

[64] Léon Bottou et al. Wasserstein GAN. arXiv, 2017.

[65] Aaron C. Courville et al. Adversarially Learned Inference. ICLR, 2016.

[66] Ryan P. Adams et al. Composing graphical models with neural networks for structured representations and fast inference. NIPS, 2016.

[67] M. Feldman et al. Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Molecular Biology and Evolution, 1999.

[68] Mohammad Emtiyaz Khan et al. Conjugate-Computation Variational Inference: Converting Variational Inference in Non-Conjugate Models to Inferences in Conjugate Models. AISTATS, 2017.

[69] Michael I. Jordan et al. Graphical Models, Exponential Families, and Variational Inference. Found. Trends Mach. Learn., 2008.

[70] Timothy Baldwin et al. Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality. EACL, 2014.

[71] Thomas L. Griffiths et al. Hierarchical Topic Models and the Nested Chinese Restaurant Process. NIPS, 2003.

[72] Andrew Gordon Wilson et al. Bayesian GAN. NIPS, 2017.

[73] Mark Steyvers et al. Finding scientific topics. Proceedings of the National Academy of Sciences, 2004.

[74] Jiqiang Guo et al. Stan: A Probabilistic Programming Language. Journal of Statistical Software, 2017.

[75] Xiaogang Wang et al. Deep Learning Face Attributes in the Wild. ICCV, 2015.

[76] J. Dickey. Multiple Hypergeometric Functions: Probabilistic Interpretations and Statistical Uses, 1983.

[77] Richard S. Zemel et al. Generative Moment Matching Networks. ICML, 2015.

[78] Scott W. Linderman et al. Reparameterization Gradients through Acceptance-Rejection Sampling Algorithms. AISTATS, 2016.

[79] Wei Li et al. Pachinko allocation: DAG-structured mixture models of topic correlations. ICML, 2006.

[80] Chong Wang et al. Reading Tea Leaves: How Humans Interpret Topic Models. NIPS, 2009.

[81] Motoaki Kawanabe et al. Direct density-ratio estimation with dimensionality reduction via least-squares hetero-distributional subspace search. Neural Networks, 2011.

[82] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 2002.

[83] Michael I. Jordan et al. An Introduction to Variational Methods for Graphical Models. Machine Learning, 1999.

[84] Sergey Ioffe et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML, 2015.

[85] Naftali Tishby et al. Opening the Black Box of Deep Neural Networks via Information. arXiv, 2017.

[86] Phil Blunsom et al. Neural Variational Inference for Text Processing. ICML, 2015.

[87] David J. C. MacKay. Choice of Basis for Laplace Approximation. Machine Learning, 1998.

[88] John D. Lafferty et al. Correlated Topic Models. NIPS, 2005.

[89] Shakir Mohamed et al. Implicit Reparameterization Gradients. NeurIPS, 2018.

[90] Roger B. Grosse et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature. ICML, 2015.

[91] Christopher M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics), 2006.

[92] Max Welling et al. Improved Variational Inference with Inverse Autoregressive Flow. NIPS, 2016.

[93] Thomas Hofmann. Probabilistic latent semantic indexing. SIGIR, 1999.

[94] Sanjoy Dasgupta et al. A Generalization of Principal Components Analysis to the Exponential Family. NIPS, 2001.

[95] Charles A. Sutton et al. VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning. NIPS, 2017.

[96] Thomas M. Cover et al. Elements of Information Theory, 2005.

[97] Yee Whye Teh et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables. ICLR, 2016.

[98] Yiming Yang et al. MMD GAN: Towards Deeper Understanding of Moment Matching Network. NIPS, 2017.

[99] Geoffrey E. Hinton et al. Replicated Softmax: an Undirected Topic Model. NIPS, 2009.

[100] Andrew McCallum et al. Rethinking LDA: Why Priors Matter. NIPS, 2009.

[101] Charles A. Sutton et al. Autoencoding Variational Inference For Topic Models. ICLR, 2017.

[102] Zoubin Ghahramani et al. Turing: A Language for Flexible Probabilistic Inference, 2018.

[103] Zoubin Ghahramani et al. Training generative neural networks via Maximum Mean Discrepancy optimization. UAI, 2015.

[104] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning. Machine Learning, 2004.

[105] Iain Murray et al. Masked Autoregressive Flow for Density Estimation. NIPS, 2017.

[106] Alex Graves et al. Associative Compression Networks, 2018.

[107] David Duvenaud et al. FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models. ICLR, 2018.

[108] Bernhard Schölkopf et al. A Kernel Two-Sample Test. J. Mach. Learn. Res., 2012.

[109] Wei Li et al. Mixtures of hierarchical topics with Pachinko allocation. ICML, 2007.

[110] Junichiro Hirayama et al. Bregman divergence as general framework to estimate unnormalized statistical models. UAI, 2011.