Denotational validation of higher-order Bayesian inference

We present a modular semantic account of Bayesian inference algorithms for probabilistic programming languages, as used in data science and machine learning. Sophisticated inference algorithms are often explained as compositions of smaller parts. However, neither their theoretical justification nor their implementation reflects this modularity. We show how to conceptualise and analyse such inference algorithms as manipulating intermediate representations of probabilistic programs using higher-order functions and inductive types, and their denotational semantics. Semantic accounts of continuous distributions use measurable spaces. However, our use of higher-order functions presents a substantial technical difficulty: it is impossible to define a measurable space structure over the collection of measurable functions between arbitrary measurable spaces that is compatible with standard operations on those functions, such as function application. We overcome this difficulty using quasi-Borel spaces, a recently proposed mathematical structure that supports both function spaces and continuous distributions. We define a class of semantic structures for representing probabilistic programs, and semantic validity criteria for transformations of these representations in terms of distribution preservation. We develop a collection of building blocks for composing representations. We use these building blocks to validate common inference algorithms such as Sequential Monte Carlo and Markov Chain Monte Carlo. To emphasise the connection between the semantic manipulation and its traditional measure-theoretic origins, we use Kock's synthetic measure theory. We demonstrate its usefulness by proving a quasi-Borel counterpart to the Metropolis-Hastings-Green theorem.
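
As a rough illustration of the idea of validating a transformation between representations by distribution preservation, the following is a minimal Python sketch restricted to finite discrete distributions, so it does not touch quasi-Borel spaces or continuous distributions at all. All names (`Return`, `Sample`, `Score`, `meaning`, `to_weighted_list`, `list_meaning`, `normalise`) are hypothetical and chosen for this example only; they are not the paper's definitions. A program is an inductive type whose `Sample` node carries a higher-order continuation, and the transformation into a list of weighted values is checked to denote the same unnormalised measure as the original program.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Return:
    """A program that simply yields a value."""
    value: object


@dataclass
class Sample:
    """Draw from a finite distribution, then continue with a higher-order function."""
    dist: List[Tuple[object, Fraction]]   # (outcome, probability) pairs
    cont: Callable[[object], "Program"]   # continuation applied to the outcome


@dataclass
class Score:
    """Multiply the weight of the current execution, then continue."""
    weight: Fraction
    rest: "Program"


Program = Union[Return, Sample, Score]


def meaning(p: Program) -> Dict[object, Fraction]:
    """Unnormalised measure denoted by a program (assumed semantics for this sketch)."""
    if isinstance(p, Return):
        return {p.value: Fraction(1)}
    if isinstance(p, Score):
        return {v: p.weight * w for v, w in meaning(p.rest).items()}
    if isinstance(p, Sample):
        out: Dict[object, Fraction] = {}
        for x, px in p.dist:
            for v, w in meaning(p.cont(x)).items():
                out[v] = out.get(v, Fraction(0)) + px * w
        return out
    raise TypeError(f"not a program: {p!r}")


def to_weighted_list(p: Program) -> List[Tuple[object, Fraction]]:
    """Transformation into a second representation: one weighted value per execution path."""
    if isinstance(p, Return):
        return [(p.value, Fraction(1))]
    if isinstance(p, Score):
        return [(v, p.weight * w) for v, w in to_weighted_list(p.rest)]
    if isinstance(p, Sample):
        return [(v, px * w) for x, px in p.dist for v, w in to_weighted_list(p.cont(x))]
    raise TypeError(f"not a program: {p!r}")


def list_meaning(ws: List[Tuple[object, Fraction]]) -> Dict[object, Fraction]:
    """Measure denoted by a weighted list: sum the weights attached to each value."""
    out: Dict[object, Fraction] = {}
    for v, w in ws:
        out[v] = out.get(v, Fraction(0)) + w
    return out


def normalise(m: Dict[object, Fraction]) -> Dict[object, Fraction]:
    """Rescale a measure with non-zero total mass to a probability distribution."""
    total = sum(m.values())
    return {v: w / total for v, w in m.items()}


# Toy model: uniform prior over two coin biases, condition on one observed head
# (scoring by the bias), then predict the next flip.
prior = [(Fraction(1, 4), Fraction(1, 2)), (Fraction(3, 4), Fraction(1, 2))]
model = Sample(prior, lambda b: Score(b, Sample([(True, b), (False, 1 - b)], Return)))

# Validity check for the transformation: both representations denote the same measure.
assert list_meaning(to_weighted_list(model)) == meaning(model)
print(normalise(meaning(model)))
```

Exact rational arithmetic via `Fraction` is used so that the equality check between the two representations is not disturbed by floating-point rounding; the check itself is only a test on one model, not a proof of validity.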
