Extending Stan for Deep Probabilistic Programming

Deep probabilistic programming combines deep neural networks (for automatic hierarchical representation learning) with probabilistic models (for principled handling of uncertainty). Writing deep probabilistic models remains difficult, however, because existing programming frameworks lack concise, high-level, and clean ways to express them. To ease this task, we extend Stan, a popular high-level probabilistic programming language, so that it can use deep neural networks written in PyTorch. Because deep probabilistic models are best trained with variational inference, we also extend Stan with support for it. We implement these extensions by translating Stan programs to Pyro. Our translation clarifies the relationship between different families of probabilistic programming languages. Overall, this paper is a step towards making deep probabilistic programming easier.
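To make the training principle concrete, the sketch below hand-rolls stochastic variational inference with the reparameterization trick on a toy conjugate model (Normal prior, Normal likelihood), where the exact posterior is known and can be compared against. This is purely illustrative and uses only the Python standard library; it is not the paper's Stan-to-Pyro compilation pipeline, and all names in it are made up for the example.

```python
import math
import random

random.seed(0)

# Toy data: 20 draws from a unit-variance Gaussian with unknown mean.
data = [random.gauss(1.0, 1.0) for _ in range(20)]
n = len(data)
s_x = sum(data)

# Model: mu ~ Normal(0, 1), x_i ~ Normal(mu, 1).
# Conjugacy gives the exact posterior: Normal(s_x/(n+1), 1/sqrt(n+1)).
post_mean = s_x / (n + 1)
post_sd = 1.0 / math.sqrt(n + 1)

# Variational family q(mu) = Normal(m, exp(log_s)).
# Maximize the ELBO by stochastic gradient ascent, sampling mu via the
# reparameterization mu = m + exp(log_s) * eps with eps ~ Normal(0, 1).
m, log_s = 0.0, 0.0
lr = 0.01
for step in range(5000):
    eps = random.gauss(0.0, 1.0)
    s = math.exp(log_s)
    mu = m + s * eps                              # reparameterized sample
    grad_joint = (s_x - n * mu) - mu              # d/dmu [log p(x|mu) + log p(mu)]
    m += lr * grad_joint                          # pathwise gradient for the mean
    log_s += lr * (grad_joint * s * eps + 1.0)    # pathwise + entropy gradient

print(f"q(mu) = Normal({m:.3f}, {math.exp(log_s):.3f})")
print(f"exact posterior = Normal({post_mean:.3f}, {post_sd:.3f})")
```

For a Gaussian variational family on this conjugate model, the ELBO optimum coincides with the exact posterior, so the fitted `m` and `exp(log_s)` should land close to `post_mean` and `post_sd`. In the deep setting the paper targets, the same ELBO objective is optimized, but the model and guide contain neural networks and the gradients are computed by automatic differentiation rather than by hand.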
