Neural Density Estimation and Likelihood-free Inference

I consider two problems in machine learning and statistics: estimating the joint probability density of a collection of random variables, known as density estimation, and inferring model parameters when their likelihood is intractable, known as likelihood-free inference. The contribution of this thesis is a set of new methods for addressing these problems, based on recent advances in neural networks and deep learning.
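As a rough illustration of the two problem settings (not of the methods developed in the thesis), the sketch below fits a trivial Gaussian density model by maximum likelihood and then runs rejection ABC against a stand-in simulator. The simulator, summary statistic, prior, and tolerance are all illustrative assumptions chosen only to make the example self-contained.

import numpy as np

rng = np.random.default_rng(0)

# Density estimation: given samples from an unknown distribution, fit a model
# p(x; theta) by maximum likelihood. Here the model is a single Gaussian, whose
# maximum-likelihood estimates are simply the sample mean and standard deviation.
data = rng.normal(loc=2.0, scale=0.5, size=1000)
mu_hat, sigma_hat = data.mean(), data.std()

def log_density(x):
    """Log-density of the fitted Gaussian model (illustrative)."""
    return -0.5 * ((x - mu_hat) / sigma_hat) ** 2 - np.log(sigma_hat * np.sqrt(2 * np.pi))

# Likelihood-free inference: the simulator's likelihood is taken to be
# intractable, but it can be sampled from. Rejection ABC keeps parameter draws
# whose simulated data fall close to the observed data under a summary statistic.
def simulator(theta, n=100):
    """Stand-in stochastic simulator with parameter theta (illustrative)."""
    return rng.normal(loc=theta, scale=1.0, size=n)

def summary(x):
    return x.mean()

observed = simulator(theta=1.5)
epsilon = 0.05  # acceptance tolerance (assumed)

posterior_samples = []
while len(posterior_samples) < 500:
    theta = rng.uniform(-5.0, 5.0)  # draw from a uniform prior
    if abs(summary(simulator(theta)) - summary(observed)) < epsilon:
        posterior_samples.append(theta)

The accepted draws approximate the posterior over theta; the neural methods in the thesis aim to replace this kind of wasteful rejection scheme with learned density estimators.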
