Formal verification of tail distribution bounds in the HOL theorem prover

Tail distribution bounds play a major role in estimating failure probabilities in the performance and reliability analysis of systems. They are usually obtained from Markov's and Chebyshev's inequalities, which bound the tail distribution of a random variable in terms of its mean or variance. This paper presents the formal verification of Markov's and Chebyshev's inequalities for discrete random variables using a higher-order-logic theorem prover. It also provides the formal verification of the mean and variance relations for some widely used discrete random variables: the Uniform(m), Bernoulli(p), Geometric(p) and Binomial(m, p) random variables. This infrastructure allows us to reason precisely about tail distribution properties, which is useful for analyzing systems used in safety-critical domains such as space, medicine and transportation. For illustration, we present the performance analysis of the coupon collector's problem, a well-known, commercially used algorithm.
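
For reference, the two inequalities can be written in their standard textbook form, together with the bounds they give for the coupon collector's problem. This is a sketch in conventional notation, not the HOL formalization from the paper: the symbols X, a, T, n, p_i and H_n are assumptions introduced here, with T the number of trials needed to collect all n coupons and H_n the n-th harmonic number.

```latex
% Markov's inequality: X a nonnegative random variable, a > 0
\Pr(X \ge a) \le \frac{E[X]}{a}

% Chebyshev's inequality: X with finite variance, a > 0
\Pr\bigl(|X - E[X]| \ge a\bigr) \le \frac{\mathrm{Var}[X]}{a^{2}}

% Coupon collector: T is a sum of independent Geometric(p_i) variables,
% with p_i = (n - i + 1)/n for i = 1, ..., n
E[T] = \sum_{i=1}^{n} \frac{1}{p_i} = n H_n,
\qquad
\mathrm{Var}[T] = \sum_{i=1}^{n} \frac{1 - p_i}{p_i^{2}} < \frac{\pi^{2} n^{2}}{6}

% Resulting tail bounds
\Pr(T \ge 2 n H_n) \le \frac{1}{2},
\qquad
\Pr\bigl(|T - n H_n| \ge n H_n\bigr) < \frac{\pi^{2}}{6 H_n^{2}}
```

The Markov bound uses only the mean and stays at 1/2 regardless of n, whereas the Chebyshev bound also uses the variance and shrinks as n grows. This is why the mean and variance relations of the Geometric random variable matter for the analysis.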
