A lambda-calculus foundation for universal probabilistic programming

We develop the operational semantics of an untyped probabilistic λ-calculus with continuous distributions and both hard and soft constraints, as a foundation for universal probabilistic programming languages such as Church, Anglican, and Venture. Our first contribution is to adapt the classic operational semantics of λ-calculus to a continuous setting by creating a measure space on terms and defining step-indexed approximations. We prove the equivalence of the big-step and small-step formulations of this distribution-based semantics. To move closer to inference techniques, we also define the sampling-based semantics of a term as a function from a trace of random samples to a value. We show that the distribution induced by integrating over the space of traces equals the distribution-based semantics. Our second contribution is to formalize the implementation technique of trace Markov chain Monte Carlo (MCMC) for our calculus and to show its correctness. A key step is defining sufficient conditions for the distribution induced by trace MCMC to converge to the distribution-based semantics. To the best of our knowledge, this is the first rigorous correctness proof for trace MCMC for a higher-order functional language, or for a language with soft constraints.
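
As a rough illustration of the two views described above, the following minimal Python sketch treats a program as a function from a trace of uniform samples to a value and a weight, then compares integration over the trace space with a single-site Metropolis-Hastings chain over traces. It is not the calculus or the algorithm formalized in the paper: the model, the names program, mean_by_integration, and mean_by_trace_mcmc, the fixed trace length of two, and the uniform resampling proposal are all assumptions made for this example.

```python
import random


def program(trace):
    """Hypothetical model (an assumption, not from the paper).

    Reads two uniform draws from the trace and applies a soft constraint
    by returning an unnormalized weight alongside the value.
    """
    x, y = trace
    weight = (2.0 * x) * (2.0 * y)  # soft constraint: unnormalized density
    return x + y, weight


def mean_by_integration(n=100):
    """Posterior mean of the value, integrating over the trace space [0,1]^2
    with a midpoint grid (the cell area cancels in the ratio)."""
    h = 1.0 / n
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            value, w = program([(i + 0.5) * h, (j + 0.5) * h])
            num += value * w
            den += w
    return num / den


def mean_by_trace_mcmc(steps=20000, seed=0):
    """Posterior mean estimated by single-site Metropolis-Hastings on traces.

    Each step resamples one randomly chosen trace entry uniformly, so the
    proposal is symmetric and the acceptance ratio reduces to new_w / w.
    """
    rng = random.Random(seed)
    trace = [0.5, 0.5]  # initial trace with nonzero weight
    value, w = program(trace)
    total = 0.0
    for _ in range(steps):
        proposal = list(trace)
        proposal[rng.randrange(len(trace))] = rng.random()
        new_value, new_w = program(proposal)
        # Accept with probability min(1, new_w / w), avoiding a division.
        if rng.random() * w < new_w:
            trace, value, w = proposal, new_value, new_w
        total += value
    return total / steps


if __name__ == "__main__":
    print(mean_by_integration(), mean_by_trace_mcmc())
```

Under these assumptions the weighted model makes each draw follow a Beta(2, 1) distribution, so both estimates should be close to 4/3. The agreement between the two functions is a toy analogue of the convergence result stated in the abstract, not a substitute for its proof.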
