Bayesian Pseudocoresets

Standard Bayesian inference algorithms are prohibitively expensive in the regime of modern large-scale data. Recent work has found that a small, weighted subset of data (a coreset) may be used in place of the full dataset during inference, exploiting data redundancy to reduce computational cost. However, this approach has limitations in the increasingly common setting of sensitive, high-dimensional data. Indeed, we prove that there are situations in which the Kullback-Leibler (KL) divergence between the optimal coreset posterior and the true posterior grows with data dimension; and because coresets consist of a subset of the original data, they cannot be constructed in a manner that preserves individual privacy. We address both of these issues with a single unified solution, Bayesian pseudocoresets: a small, weighted collection of synthetic “pseudodata”, together with a variational optimization method to select both the pseudodata and the weights. The use of pseudodata (as opposed to the original datapoints) enables both the summarization of high-dimensional data and the differentially private summarization of sensitive data. Experiments on real and synthetic high-dimensional data demonstrate that Bayesian pseudocoresets achieve significant improvements in posterior approximation error over traditional coresets, and that pseudocoresets provide privacy without a significant loss in approximation quality.
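To make the variational construction concrete, the sketch below fits a pseudocoreset in a setting simple enough that the objective is exact: a conjugate Gaussian location model (prior θ ~ N(0, I), unit-variance Gaussian likelihood), where both the true posterior and the pseudocoreset posterior are Gaussian and the KL divergence between them has a closed form. This is a minimal illustration under assumed simplifications; the model, dimensions, and optimizer (Adam via optax) are illustrative choices, not the paper's general method, which does not require conjugacy.

```python
# Minimal pseudocoreset sketch, assuming a conjugate Gaussian location
# model: prior theta ~ N(0, I), likelihood x_i ~ N(theta, I). Both the
# true and the pseudocoreset posteriors are then Gaussian, so the KL
# objective has a closed form. All settings here are illustrative
# assumptions, not the paper's experimental setup.
import jax
import jax.numpy as jnp
import optax

d, N, M = 50, 10_000, 20   # data dimension, dataset size, pseudocoreset size
X = 2.0 + jax.random.normal(jax.random.PRNGKey(0), (N, d))  # data x_i ~ N(2, I)

# True posterior: N(mu_true, I / (1 + N)), with mu_true = sum_i x_i / (1 + N).
mu_true = X.sum(axis=0) / (1.0 + N)

def kl_to_true(params):
    """KL(pseudocoreset posterior || true posterior) for the isotropic
    Gaussians N(mu0, I / (1 + W)) and N(mu_true, I / (1 + N))."""
    u, log_w = params
    w = jnp.exp(log_w)             # parameterize weights as exp(log_w) > 0
    W = w.sum()                    # total pseudocoreset weight
    mu0 = (w[:, None] * u).sum(axis=0) / (1.0 + W)
    ratio = (1.0 + N) / (1.0 + W)  # per-coordinate posterior variance ratio
    quad = (1.0 + N) * jnp.sum((mu_true - mu0) ** 2)
    return 0.5 * (d * ratio + quad - d - d * jnp.log(ratio))

# Initialize pseudodata at the first M datapoints, all weights at N / M.
params = (X[:M], jnp.full(M, jnp.log(N / M)))
opt = optax.adam(1e-2)
opt_state = opt.init(params)
loss_and_grad = jax.jit(jax.value_and_grad(kl_to_true))

# Jointly optimize pseudodata locations and log-weights by gradient descent.
for _ in range(1000):
    loss, grads = loss_and_grad(params)
    updates, opt_state = opt.update(grads, opt_state)
    params = optax.apply_updates(params, updates)

print(f"KL after optimization: {float(kl_to_true(params)):.6f}")
```

In this conjugate setting, M = 20 weighted pseudopoints can match the true posterior exactly (it suffices to match the posterior mean and the total weight), so the optimized KL approaches zero; the regime the paper targets is models where no such closed form exists and the same objective must be optimized with stochastic gradient estimates.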
