On Sampling and Greedy MAP Inference of Constrained Determinantal Point Processes

Subset selection problems ask for a small, diverse, yet representative subset of the given data. When pairwise similarities are captured by a kernel, the determinants of submatrices provide a measure of the diversity or independence of the items within a subset. Matroid theory gives another notion of independence, giving rise to optimization and sampling questions about Determinantal Point Processes (DPPs) under matroid constraints. Partition constraints, as a special case, arise naturally when additional labeling or clustering information, beyond the kernel, is incorporated into DPPs. Finding the maximum-determinant submatrix under matroid constraints on its row/column indices has been studied previously. However, the corresponding question of sampling from DPPs under matroid constraints has remained unresolved beyond the simple cardinality-constrained k-DPPs. We give the first polynomial-time algorithm to sample exactly from DPPs under partition constraints, for any constant number of partitions. We complement this with a complexity-theoretic barrier that rules out such a result under general matroid constraints. Our experiments indicate that partition-constrained DPPs offer more flexibility and more diversity than k-DPPs and their naive extensions, while remaining reasonably efficient in running time. We also show that a simple greedy initialization followed by local search gives improved approximation guarantees for MAP inference in k-DPPs on well-conditioned kernels. Our experiments show that this improvement is significant for larger values of k, supporting our theoretical result.
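To make the greedy-then-local-search idea concrete, the sketch below builds a size-k set by greedily adding the item with the largest gain in log-determinant of the kernel submatrix, and then performs single-item swaps as long as the determinant improves. This is only an illustrative sketch, assuming a positive semidefinite kernel L given as a NumPy array; the names greedy_map_kdpp, local_search, and log_det are placeholders, and the code does not reproduce the paper's exact algorithm or approximation analysis.

```python
import numpy as np

def log_det(L, S):
    """Log-determinant of the principal submatrix of L indexed by the list S."""
    if not S:
        return 0.0
    sign, ld = np.linalg.slogdet(L[np.ix_(S, S)])
    return ld if sign > 0 else -np.inf

def greedy_map_kdpp(L, k):
    """Greedy initialization: repeatedly add the item with the largest log-det gain."""
    n = L.shape[0]
    S = []
    for _ in range(k):
        cands = [i for i in range(n) if i not in S]
        gains = [log_det(L, S + [i]) for i in cands]
        S.append(cands[int(np.argmax(gains))])
    return S

def local_search(L, S, max_iters=100):
    """Swap a selected item for an unselected one whenever it increases the determinant."""
    n = L.shape[0]
    S = list(S)
    for _ in range(max_iters):
        best = log_det(L, S)
        improved = False
        for i in list(S):
            for j in range(n):
                if j in S:
                    continue
                T = [j if x == i else x for x in S]
                val = log_det(L, T)
                if val > best:
                    S, best, improved = T, val, True
        if not improved:
            break
    return S

# Usage sketch on a random PSD kernel (for illustration only):
# X = np.random.randn(50, 10); L = X @ X.T
# S = local_search(L, greedy_map_kdpp(L, k=5))
```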
