Using The Matrix Ridge Approximation to Speedup Determinantal Point Processes Sampling Algorithms

Determinantal point process (DPP) is an important probabilistic model that has extensive applications in artificial intelligence. The exact sampling algorithm of DPP requires the full eigenvalue decomposition of the kernel matrix which has high time and space complexities. This prohibits the applications of DPP from large-scale datasets. Previous work has applied the Nystrom method to speedup the sampling algorithm of DPP, and error bounds have been established for the approximation. In this paper we employ the matrix ridge approximation (MRA) to speedup the sampling algorithm of DPP, showing that our approach MRA-DPP has stronger error bound than the Nystrom-DPP. In certain circumstances our MRA-DPP is provably exact, whereas the Nystrom-DPP is far from the ground truth. Finally, experiments on several real-world datasets show that our MRA-DPP is more accurate than the other approximation approaches.

[1]  Arild Stubhaug Acta Mathematica , 1886, Nature.

[2]  E. Nyström Über Die Praktische Auflösung von Integralgleichungen mit Anwendungen auf Randwertaufgaben , 1930 .

[3]  Charles R. Johnson,et al.  Topics in Matrix Analysis , 1991 .

[4]  David J. Spiegelhalter,et al.  Machine Learning, Neural and Statistical Classification , 2009 .

[5]  Tobias Scheffer,et al.  International Conference on Machine Learning (ICML-99) , 1999, Künstliche Intell..

[6]  Christopher K. I. Williams,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.

[7]  Jitendra Malik,et al.  Spectral grouping using the Nystrom method , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Petros Drineas,et al.  On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning , 2005, J. Mach. Learn. Res..

[9]  Santosh S. Vempala,et al.  Matrix approximation and projective clustering via volume sampling , 2006, SODA '06.

[10]  Y. Peres,et al.  Determinantal Processes and Independence , 2005, math/0503110.

[11]  S. Muthukrishnan,et al.  Relative-Error CUR Matrix Decompositions , 2007, SIAM J. Matrix Anal. Appl..

[12]  Ivor W. Tsang,et al.  Improved Nyström low-rank approximation and error analysis , 2008, ICML '08.

[13]  James T. Kwok,et al.  Clustered Nyström Method for Large Scale Manifold Learning and Dimension Reduction , 2010, IEEE Transactions on Neural Networks.

[14]  Ben Taskar,et al.  Structured Determinantal Point Processes , 2010, NIPS.

[15]  Ameet Talwalkar,et al.  On the Impact of Kernel Approximation on Learning Accuracy , 2010, AISTATS.

[16]  Christos Boutsidis,et al.  Near Optimal Column-Based Matrix Reconstruction , 2011, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.

[17]  James T. Kwok,et al.  Time and space efficient spectral clustering via column sampling , 2011, CVPR 2011.

[18]  Ben Taskar,et al.  Learning Determinantal Point Processes , 2011, UAI.

[19]  Nathan Halko,et al.  Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions , 2009, SIAM Rev..

[20]  Ben Taskar,et al.  k-DPPs: Fixed-Size Determinantal Point Processes , 2011, ICML.

[21]  Ben Taskar,et al.  Determinantal Point Processes for Machine Learning , 2012, Found. Trends Mach. Learn..

[22]  Ameet Talwalkar,et al.  Sampling Methods for the Nyström Method , 2012, J. Mach. Learn. Res..

[23]  Venkatesan Guruswami,et al.  Optimal column-based low-rank matrix reconstruction , 2011, SODA.

[24]  Rong Jin,et al.  Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison , 2012, NIPS.

[25]  J. M. Aldaz Sharp bounds for the difference between the arithmetic and geometric means , 2012, 1203.4454.

[26]  Ben Taskar,et al.  Nystrom Approximation for Large-Scale Determinantal Processes , 2013, AISTATS.

[27]  Ben Taskar,et al.  Approximate Inference in Continuous Determinantal Processes , 2013, NIPS.

[28]  Ameet Talwalkar,et al.  Large-scale SVD and manifold learning , 2013, J. Mach. Learn. Res..

[29]  Zhihua Zhang,et al.  The matrix ridge approximation: algorithms and applications , 2013, Machine Learning.

[30]  Zhihua Zhang,et al.  Improving CUR matrix decomposition and the Nyström approximation via adaptive sampling , 2013, J. Mach. Learn. Res..

[31]  Zhihua Zhang,et al.  Efficient Algorithms and Error Analysis for the Modified Nystrom Method , 2014, AISTATS.

[32]  Michael W. Mahoney,et al.  Revisiting the Nystrom Method for Improved Large-scale Machine Learning , 2013, J. Mach. Learn. Res..