Asymptotic equivalence of fixed-size and varying-size determinantal point processes

Determinantal Point Processes (DPPs) are popular models for point processes with repulsion. They appear in numerous contexts, from physics to graph theory, and display appealing theoretical properties. On the practical side, since DPPs tend to select sets of points that are some distance apart (repulsion), they have been advocated as a way of producing random subsets with high diversity. DPPs come in two variants: fixed-size and varying-size. A sample from a varying-size DPP is a subset of random cardinality, while in fixed-size "$k$-DPPs" the cardinality is fixed at $k$. The latter makes more sense in many applications, but unfortunately their computational properties are less attractive since, among other things, inclusion probabilities are harder to compute. In this work we show that as the size of the ground set grows, $k$-DPPs and DPPs become equivalent, meaning that their inclusion probabilities converge. As a by-product, we obtain saddlepoint formulas for inclusion probabilities in $k$-DPPs. These turn out to be extremely accurate, and suffer less from numerical difficulties than exact methods do. Our results also suggest that $k$-DPPs and DPPs have equivalent maximum likelihood estimators. Finally, we obtain results on asymptotic approximations of elementary symmetric polynomials, which may be of independent interest.
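
To make the comparison of inclusion probabilities concrete, the following is a minimal numerical sketch, not the paper's saddlepoint method. It computes first-order inclusion probabilities of a $k$-DPP exactly, using elementary symmetric polynomials of the eigenvalues of an L-ensemble matrix $L$, and compares them with those of a varying-size DPP. The function names, the Gaussian kernel, and the choice of matching the two processes by rescaling $L$ to $e^{\nu} L$ so that the expected cardinality equals $k$ are all assumptions made for this illustration; the abstract does not specify them.

```python
import numpy as np

def elementary_symmetric(lam, k):
    """Return [e_0, ..., e_k] of the values in lam (standard recursion)."""
    e = np.zeros(k + 1)
    e[0] = 1.0
    for x in lam:
        # Right-hand side is evaluated on the old e before the in-place add.
        e[1:] += x * e[:-1]
    return e

def kdpp_inclusion(L, k):
    """Exact inclusion probabilities P(i in S) for a k-DPP with L-ensemble L."""
    lam, U = np.linalg.eigh(L)
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative round-off
    e_k = elementary_symmetric(lam, k)[k]
    # Probability that eigenvector n is selected in the mixture representation.
    p_eig = np.array([
        lam[n] * elementary_symmetric(np.delete(lam, n), k - 1)[k - 1] / e_k
        for n in range(len(lam))
    ])
    return (U ** 2) @ p_eig

def matched_dpp_inclusion(L, k, iters=200):
    """Inclusion probabilities of the DPP with L-ensemble exp(nu) * L, where
    nu is found by bisection so that the expected sample size equals k.
    (Illustrative matching rule; an assumption of this sketch.)"""
    lam, U = np.linalg.eigh(L)
    lam = np.clip(lam, 0.0, None)
    lo, hi = -50.0, 50.0
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        t = lam * np.exp(nu)
        if np.sum(t / (1.0 + t)) < k:
            lo = nu
        else:
            hi = nu
    t = lam * np.exp(0.5 * (lo + hi))
    return (U ** 2) @ (t / (1.0 + t))

# Small demonstration on a Gaussian kernel over random points in the unit square.
rng = np.random.default_rng(0)
pts = rng.uniform(size=(40, 2))
sq_dist = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
L_demo = np.exp(-sq_dist / (2 * 0.3 ** 2))
k_demo = 5
gap = np.abs(kdpp_inclusion(L_demo, k_demo) - matched_dpp_inclusion(L_demo, k_demo))
print("largest gap between k-DPP and matched DPP inclusion probabilities:", gap.max())
```

Both vectors of inclusion probabilities sum to $k$ by construction, so the printed gap measures only how differently the two processes distribute that mass over items. The per-eigenvalue leave-one-out recursion used here is chosen for readability rather than speed or numerical robustness.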
