Slicing probabilistic programs

Probabilistic programs use the familiar notation of programming languages to specify probabilistic models. Suppose we want to estimate the distribution of the return expression r of a probabilistic program P. We would like to slice P to obtain a simpler program Sli(P) that retains only the parts of P relevant to estimating r and elides the rest. We want the Sli transformation to be both correct and efficient. By correct, we mean that P and Sli(P) yield identical estimates of r. By efficient, we mean that estimation over Sli(P) is as fast as possible. We show that the usual notion of program slicing, which traverses control and data dependences backward from the return expression r, is unsatisfactory for probabilistic programs: it produces incorrect slices on some programs and suboptimal slices on others. Our key insight is that, in addition to the control and data dependences used to slice non-probabilistic programs, a new kind of dependence, which we call observe dependence, arises naturally from the observe statements in probabilistic programs. We propose a new definition of Sli(P) that is both correct and efficient for probabilistic programs, computing slices from observe dependence together with control and data dependences. We prove correctness mathematically and demonstrate efficiency empirically. Applying the Sli transformation as a pre-pass improves the efficiency of probabilistic inference not only in our own inference tool, R2, but also in other inference systems such as Church and Infer.NET.
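To make the correctness issue concrete, the following minimal sketch compares three versions of a toy program by exact enumeration. The toy program, the helper `posterior`, and the three program functions are illustrative assumptions and do not come from the paper or from R2, Church, or Infer.NET. The toy program draws three fair coins x, y, z, observes `x or y`, and returns x. A backward slice over data and control dependences alone never reaches the observe statement, because the observe defines no variable that x depends on.

```python
from fractions import Fraction
from itertools import product


def posterior(coins, program):
    """Exact posterior of the return value, by enumerating fair coin flips.

    `program` maps an environment (coin name -> bool) to a pair
    (accepted, return_value); runs that fail an observe are rejected.
    """
    weights = {}
    prior = Fraction(1, 2 ** len(coins))  # each assignment is equally likely
    for values in product([False, True], repeat=len(coins)):
        env = dict(zip(coins, values))
        accepted, result = program(env)
        if accepted:
            weights[result] = weights.get(result, Fraction(0)) + prior
    total = sum(weights.values(), Fraction(0))
    return {result: w / total for result, w in weights.items()}


def full_program(env):
    # x, y, z ~ fair coins; observe(x or y); return x   (z is irrelevant)
    return (env["x"] or env["y"]), env["x"]


def naive_slice(env):
    # Hypothetical data/control-only slice from `return x`:
    # drops y, z, and the observe statement.
    return True, env["x"]


def observe_aware_slice(env):
    # Hypothetical slice that also keeps the observe on x and the
    # variable y it reads; only the irrelevant z is removed.
    return (env["x"] or env["y"]), env["x"]


full = posterior(["x", "y", "z"], full_program)
naive = posterior(["x"], naive_slice)
aware = posterior(["x", "y"], observe_aware_slice)

print("full program        P(x) =", full[True])
print("naive slice         P(x) =", naive[True])
print("observe-aware slice P(x) =", aware[True])
```

Under these assumptions, the full program and the observe-aware slice both give P(x) = 2/3, while the naive slice gives 1/2. Dropping the observe changes the estimate of r, whereas dropping only z does not, and the smaller slice enumerates four assignments instead of eight.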
