Distribution Compression in Near-linear Time

In distribution compression, one aims to accurately summarize a probability distribution P using a small number of representative points. Near-optimal thinning procedures achieve this goal by sampling n points from a Markov chain and identifying √ n points with Õ(1/ √ n) discrepancy to P. Unfortunately, these algorithms suffer from quadratic or super-quadratic runtime in the sample size n. To address this deficiency, we introduce Compress++, a simple meta-procedure for speeding up any thinning algorithm while suffering at most a factor of 4 in error. When combined with the quadratic-time kernel halving and kernel thinning algorithms of Dwivedi and Mackey (2021), Compress++ delivers √ n points with O( √ log n/n) integration error and better-than-Monte-Carlo maximum mean discrepancy in O(n log n) time and O( √ n log n) space. Moreover, Compress++ enjoys the same near-linear runtime given any quadratic-time input and reduces the runtime of super-quadratic algorithms by a square-root factor. In our benchmarks with high-dimensional Monte Carlo samples and Markov chains targeting challenging differential equation posteriors, Compress++ matches or nearly matches the accuracy of its input algorithm in orders of magnitude less time.

[1]  R. Tweedie,et al.  Exponential convergence of Langevin distributions and their discrete approximations , 1996 .

[2]  Lawrence Mitchell,et al.  Simulating Human Cardiac Electrophysiology on Clinical Time-Scales , 2011, Front. Physio..

[3]  B. Goodwin Oscillatory behavior in enzymatic control processes. , 1965, Advances in enzyme regulation.

[4]  Krikamol Muandet,et al.  Minimax Estimation of Kernel Mean Embeddings , 2016, J. Mach. Learn. Res..

[5]  Jirí Matousek,et al.  Approximations and optimal geometric divide-and-conquer , 1991, STOC '91.

[6]  Gábor Lugosi,et al.  Concentration Inequalities - A Nonasymptotic Theory of Independence , 2013, Concentration Inequalities.

[7]  Jeff M. Phillips,et al.  Near-Optimal Coresets of Kernel Density Estimates , 2018, Discrete & Computational Geometry.

[8]  Art B. Owen,et al.  Statistically Efficient Thinning of a Markov Chain Sampler , 2015, ArXiv.

[9]  Gernot Plank,et al.  Simulating ventricular systolic motion in a four-chamber heart model with spatially varying robin boundary conditions to model the effect of the pericardium , 2020, Journal of biomechanics.

[10]  Heikki Haario,et al.  Adaptive proposal distribution for random walk Metropolis algorithm , 1999, Comput. Stat..

[11]  J. M. Phillips Algorithms for ε-approximations of Terrains ? , 2008 .

[12]  Oluwasanmi Koyejo,et al.  Examples are not enough, learn to criticize! Criticism for Interpretability , 2016, NIPS.

[13]  A. Berlinet,et al.  Reproducing kernel Hilbert spaces in probability and statistics , 2004 .

[14]  Bernhard Schölkopf,et al.  A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res..

[15]  J. Tropp FREEDMAN'S INEQUALITY FOR MATRIX MARTINGALES , 2011, 1101.3039.

[16]  Jon Cockayne,et al.  Optimal thinning of MCMC output , 2020, Journal of the Royal Statistical Society: Series B (Statistical Methodology).

[17]  Alexander J. Smola,et al.  Super-Samples from Kernel Herding , 2010, UAI.

[18]  M. Girolami,et al.  Riemann manifold Langevin and Hamiltonian Monte Carlo methods , 2011, Journal of the Royal Statistical Society: Series B (Statistical Methodology).

[19]  Joel A. Tropp,et al.  User-Friendly Tail Bounds for Sums of Random Matrices , 2010, Found. Comput. Math..

[20]  Manfred Liebmann,et al.  Anatomically accurate high resolution modeling of human whole heart electromechanics: A strongly scalable algebraic multigrid solver method for nonlinear deformation , 2016, J. Comput. Phys..

[21]  A. J. Lotka Elements of Physical Biology. , 1925, Nature.

[22]  Raaz Dwivedi,et al.  Generalized Kernel Thinning , 2021, ArXiv.

[23]  Andrew S. Glassner,et al.  MONTE CARLO INTEGRATION , 1995 .

[24]  Lester Mackey,et al.  Kernel Thinning , 2021, COLT.

[25]  A. Tanskanen,et al.  A simplified local control model of calcium-induced calcium release in cardiac ventricular myocytes. , 2004, Biophysical journal.

[26]  Bernard Chazelle,et al.  On linear-time deterministic algorithms for optimization problems in fixed dimension , 1996, SODA '93.