Sampling from a k-DPP without looking at all items

Determinantal point processes (DPPs) are a useful probabilistic model for selecting a small diverse subset from a large collection of items, with applications in summarization, stochastic optimization, active learning, and more. Given a kernel function and a subset size $k$, our goal is to sample $k$ out of $n$ items with probability proportional to the determinant of the kernel matrix induced by the subset (a.k.a. a $k$-DPP). Existing $k$-DPP sampling algorithms require an expensive preprocessing step involving multiple passes over all $n$ items, which is infeasible for large datasets. A naive heuristic for addressing this problem is to uniformly subsample a fraction of the data and perform $k$-DPP sampling only on those items; however, this offers no guarantee that the produced sample will even approximately resemble the target distribution over the original dataset. In this paper, we develop an algorithm that adaptively builds a sufficiently large uniform sample of the data, which is then used to efficiently generate a smaller set of $k$ items, while ensuring that this set is drawn exactly from the target distribution defined over all $n$ items. We show empirically that our algorithm produces a $k$-DPP sample after observing only a small fraction of all elements, leading to several orders of magnitude faster performance compared to the state of the art.
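To make the target distribution concrete, below is a minimal sketch of an exact $k$-DPP sampler that simply enumerates all size-$k$ subsets and weights each by the determinant of its kernel submatrix. This brute-force approach is feasible only for tiny $n$ and is not the paper's algorithm (which avoids looking at all items); the RBF kernel and the names `rbf_kernel` and `k_dpp_sample` are illustrative choices, not from the source.

```python
# Brute-force exact k-DPP sampling: Pr(S) proportional to det(L_S), |S| = k.
# Only usable for tiny n, since it enumerates all C(n, k) subsets.
import itertools
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix L for the rows of X (assumed kernel choice)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-d2 / (2 * sigma**2))

def k_dpp_sample(L, k, rng):
    """Draw one subset S of size k with probability proportional to det(L_S)."""
    n = L.shape[0]
    subsets = list(itertools.combinations(range(n), k))
    # Principal minors of a PSD kernel matrix are nonnegative, so these are valid weights.
    weights = np.array([np.linalg.det(L[np.ix_(s, s)]) for s in subsets])
    probs = weights / weights.sum()
    return subsets[rng.choice(len(subsets), p=probs)]

rng = np.random.default_rng(0)
X = rng.standard_normal((12, 2))      # 12 items in 2 dimensions
L = rbf_kernel(X)
print(k_dpp_sample(L, k=3, rng=rng))  # e.g. (1, 5, 9): a diverse triple
```

The cost of this sketch is $\Theta\binom{n}{k}$ determinant evaluations, which is exactly the kind of full-data computation the paper's adaptive uniform-subsampling scheme is designed to avoid while still producing an exact sample.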
