ωPAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs

We introduce a new setting, the category of $\omega$PAP spaces, for reasoning denotationally about expressive differentiable and probabilistic programming languages. Our semantics is general enough to assign meanings to most practical probabilistic and differentiable programs, including those that use general recursion, higher-order functions, discontinuous primitives, and both discrete and continuous sampling. But crucially, it is also specific enough to exclude many pathological denotations, enabling us to establish new results about both deterministic differentiable programs and probabilistic programs. In the deterministic setting, we prove very general correctness theorems for automatic differentiation and its use within gradient descent. In the probabilistic setting, we establish the almost-everywhere differentiability of probabilistic programs' trace density functions, and the existence of convenient base measures for density computation in Monte Carlo inference. In some cases these results were previously known, but required detailed proofs with an operational flavor; by contrast, all our proofs work directly with programs' denotations.
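To make concrete the kind of behaviour a correctness theorem for automatic differentiation has to account for, here is a minimal sketch. It is not taken from the paper. It assumes a hand-rolled forward-mode AD over dual numbers and the common convention that the derivative of `relu` at 0 is 0; the names `Dual`, `relu`, `f`, and `derivative` are hypothetical. The function `f` is written with a discontinuous branch, yet it denotes the identity, whose true derivative is 1 everywhere. AD agrees with that derivative except at the single point 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number: a value paired with its tangent (directional derivative)."""
    val: float
    tan: float

    def __add__(self, other: "Dual") -> "Dual":
        return Dual(self.val + other.val, self.tan + other.tan)

    def __sub__(self, other: "Dual") -> "Dual":
        return Dual(self.val - other.val, self.tan - other.tan)

    def __neg__(self) -> "Dual":
        return Dual(-self.val, -self.tan)


def relu(x: Dual) -> Dual:
    # Branching primitive. Assumed convention: the derivative at 0 is taken to be 0.
    return x if x.val > 0 else Dual(0.0, 0.0)


def f(x: Dual) -> Dual:
    # Mathematically relu(x) - relu(-x) = x for every real x.
    return relu(x) - relu(-x)


def derivative(g, x: float) -> float:
    # Forward-mode AD: seed the tangent with 1 and read off the output tangent.
    return g(Dual(x, 1.0)).tan


if __name__ == "__main__":
    for point in (2.0, -3.0, 0.0):
        print(point, derivative(f, point))
```

Away from 0 the computed derivative is 1, matching the true derivative; at 0 the sketch returns 0. The exceptional set here is a single point, which has measure zero.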
