Tail Bound Analysis for Probabilistic Programs via Central Moments

For probabilistic programs, it is usually not possible to automatically derive exact information about their properties, such as the distribution of states at a given program point. Instead, one can attempt to derive approximations, such as upper bounds on tail probabilities. Such bounds can be obtained via concentration inequalities, which rely on the moments of a distribution, such as the expectation (the first raw moment) or the variance (the second central moment). Tail bounds obtained using central moments are often tighter than the ones obtained using raw moments, but automatically analyzing higher moments is more challenging. This paper presents an analysis for probabilistic programs that automatically derives symbolic over- and under-approximations for variances, as well as higher central moments. To overcome the challenges of higher-moment analysis, it generalizes analyses for expectations with an algebraic abstraction that simultaneously analyzes different moments, utilizing relations between them. The analysis is proved sound with respect to a trace-based, small-step model that maps programs to Markov chains. A key innovation is the notion of semantic optional stopping, and a generalization of the classical optional-stopping theorem. The analysis has been implemented using a template-based technique that reduces the inference of polynomial approximations to linear programming. Experiments with our prototype central-moment analyzer show that, despite the analyzer's over-/under-approximations of various quantities, it obtains tighter tail bounds than a prior system that uses only raw moments, such as expectations.
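
To illustrate why central moments can give tighter tail bounds than raw moments, the following minimal sketch compares Markov's inequality (which uses only the expectation) with Cantelli's inequality (which also uses the variance). It is not the paper's analysis: the function names and the numeric values for the mean and variance are hypothetical, and the moments are assumed to be known exactly rather than inferred from a program.

```python
from fractions import Fraction


def markov_bound(mean, d):
    """Markov's inequality: P[X >= d] <= E[X] / d for nonnegative X and d > 0."""
    return min(Fraction(1), Fraction(mean) / d)


def cantelli_bound(mean, variance, d):
    """Cantelli's inequality: P[X - E[X] >= a] <= Var[X] / (Var[X] + a^2) for a > 0.

    Here a = d - E[X]; when d does not exceed the mean, only the trivial bound 1 applies.
    """
    a = Fraction(d) - mean
    if a <= 0:
        return Fraction(1)
    return Fraction(variance) / (variance + a * a)


# Hypothetical moments of a nonnegative random variable X (for example, a cost counter).
mean, variance = Fraction(10), Fraction(20)

for d in (20, 40, 80):
    print(f"P[X >= {d}]: Markov <= {markov_bound(mean, d)}, "
          f"Cantelli <= {cantelli_bound(mean, variance, d)}")
```

Under these assumed values, the variance-based bound decays quadratically in the distance from the mean, whereas the expectation-based bound decays only linearly in the threshold d.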
