Symbolic parallel adaptive importance sampling for probabilistic program analysis

Probabilistic software analysis aims to quantify the probability that a target event occurs during the execution of a program that processes uncertain input data or is itself written using probabilistic programming constructs. Recent techniques combine classic static analysis methods with inference procedures to accurately quantify the probability of rare target events, such as failures in a mission-critical system. However, current techniques face several scalability and applicability limitations when analyzing software whose inputs follow high-dimensional multivariate distributions. In this paper, we present SYMbolic Parallel Adaptive Importance Sampling (SYMPAIS), a new algorithm that combines symbolic execution with adaptive importance sampling to analyze probabilistic programs. Our method provides a general solution that scales to systems with high-dimensional inputs and demonstrates superior performance in quantifying rare events compared to prior work. Preliminary experimental results support the potential efficacy of our solution.
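
To make the underlying idea concrete, the following is a minimal sketch of plain (non-adaptive) importance sampling for a rare event. It is not the SYMPAIS algorithm: it has no adaptation, no parallel chains, and no symbolic execution engine. The one-dimensional standard normal input, the hypothetical `path_condition` (standing in for a constraint that symbolic execution might extract), and the hand-picked proposal centered at 4.0 are all assumptions made for illustration.

```python
import math
import random


def path_condition(x: float) -> bool:
    """Hypothetical path constraint: the target event occurs when x > 4."""
    return x > 4.0


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_estimate(n_samples: int, proposal_mean: float, seed: int = 0) -> float:
    """Estimate P(path_condition(x)) for x ~ N(0, 1) using a N(proposal_mean, 1) proposal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(proposal_mean, 1.0)  # draw from the proposal, not the input distribution
        if path_condition(x):
            # Weight each hit by the ratio of input density to proposal density.
            total += normal_pdf(x, 0.0, 1.0) / normal_pdf(x, proposal_mean, 1.0)
    return total / n_samples


# Proposal centered on the boundary of the rare region (an assumed, hand-tuned choice).
estimate = importance_sampling_estimate(100_000, proposal_mean=4.0)

# Closed-form tail probability of a standard normal, available only in this toy case.
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
```

Sampling directly from the input distribution would see the event only a handful of times in 100,000 draws, whereas the shifted proposal places about half of its samples inside the target region. In higher dimensions a good proposal is hard to choose by hand, which is the gap that adaptive schemes are meant to address.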
