Optimal approximate sampling from discrete probability distributions

This paper addresses a fundamental problem in random variate generation: given access to a random source that emits a stream of independent fair bits, what is the most accurate and entropy-efficient algorithm for sampling from a discrete probability distribution (p1, …, pn), where the probabilities of the output distribution (p̂1, …, p̂n) of the sampling algorithm must be specified using at most k bits of precision? We present a theoretical framework for formulating this problem and provide new techniques for finding sampling algorithms that are optimal both statistically (in the sense of sampling accuracy) and information-theoretically (in the sense of entropy consumption). We leverage these results to build a system that, for a broad family of measures of statistical accuracy, delivers a sampling algorithm whose expected entropy usage is minimal among those that induce the same distribution (i.e., is “entropy-optimal”) and whose output distribution (p̂1, …, p̂n) is a closest approximation to the target distribution (p1, …, pn) among all entropy-optimal sampling algorithms that operate within the specified k-bit precision. This optimal approximate sampler is also a closer approximation than any (possibly entropy-suboptimal) sampler that consumes a bounded amount of entropy with the specified precision, a class which includes floating-point implementations of inversion sampling and related methods found in many software libraries. We evaluate the accuracy, entropy consumption, precision requirements, and wall-clock runtime of our optimal approximate sampling algorithms on a broad set of distributions, demonstrating the ways that they are superior to existing approximate samplers and establishing that they often consume significantly fewer resources than are needed by exact samplers.
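To make the setting concrete, the sketch below samples from a distribution whose probabilities are already k-bit dyadic, p̂i = mi / 2^k, using only fair coin flips. It follows the classic Knuth and Yao tree-walking construction, which is a standard way to obtain an entropy-optimal sampler for a fixed dyadic distribution; it is not claimed to be this paper's implementation, and it does not perform the paper's search for the closest k-bit approximation to (p1, …, pn). The names knuth_yao_sample, outcome_counts, weights, and next_bit are hypothetical and chosen only for illustration.

```python
import random
from itertools import product


def knuth_yao_sample(weights, k, next_bit):
    """Draw index i with probability weights[i] / 2**k using fair bits.

    Assumptions (illustrative, not from the paper): weights are integers
    with 0 <= m < 2**k that sum to exactly 2**k, and next_bit() returns
    an independent fair bit (0 or 1) on each call.
    """
    if sum(weights) != 2 ** k or any(m < 0 or m >= 2 ** k for m in weights):
        raise ValueError("weights must lie in [0, 2**k) and sum to 2**k")
    d = 0  # position among the nodes of the current tree level
    for level in range(1, k + 1):
        d = 2 * d + next_bit()  # descend one level using one fair bit
        for i, m in enumerate(weights):
            # The level-th binary digit of weights[i] / 2**k marks a leaf
            # labeled i at this depth of the tree.
            d -= (m >> (k - level)) & 1
            if d < 0:
                return i
    raise AssertionError("unreachable when the weights sum to 2**k")


def outcome_counts(weights, k):
    """Run the sampler on every k-bit string and tally the outcomes.

    Each string has probability 2**-k, so the tally for index i should
    equal weights[i] if the sampler induces the intended distribution.
    """
    counts = [0] * len(weights)
    for bits in product((0, 1), repeat=k):
        stream = iter(bits)
        counts[knuth_yao_sample(weights, k, lambda: next(stream))] += 1
    return counts


# Example: p-hat = (3/8, 2/8, 3/8) with k = 3 bits of precision.
sample = knuth_yao_sample([3, 2, 3], 3, lambda: random.getrandbits(1))
```

The sampler stops as soon as it reaches a leaf, so outcomes with short binary expansions consume fewer bits; the exhaustive check in outcome_counts is only practical for small k.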
