Hashing-Based-Estimators for Kernel Density in High Dimensions

Given a set of points P ⊂ R^d and a kernel k, the Kernel Density Estimate at a point x ∈ R^d is defined as \mathrm{KDE}_{P}(x)=\frac{1}{|P|}\sum_{y\in P} k(x,y). We study the problem of designing a data structure that, given a data set P and a kernel function, returns approximations to the kernel density of a query point in sublinear time. We introduce a class of unbiased estimators for kernel density implemented through locality-sensitive hashing, and give general theorems bounding the variance of such estimators. These estimators give rise to efficient data structures for estimating the kernel density in high dimensions for a variety of commonly used kernels. Our work is the first to provide data structures with theoretical guarantees that improve upon simple random sampling in high dimensions.
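To make the idea of an unbiased estimator built on locality-sensitive hashing concrete, the following is a minimal sketch, not the construction analyzed in the paper. It assumes a Gaussian kernel and a SimHash (random hyperplane) family, whose collision probability for two vectors at angle θ is (1 - θ/π)^bits; both choices, and all names such as `SimHashTable` and `hbe_estimate`, are illustrative assumptions. For each hash table, the estimator draws a uniform point y from the query's bucket B and returns k(x,y)·|B| / (|P|·p(x,y)), where p(x,y) is the collision probability; taking expectations over the hash function recovers \mathrm{KDE}_{P}(x). A plain random-sampling estimator is included as the baseline.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def exact_kde(points, x, kernel):
    """Exact KDE_P(x): the average kernel value over all points (linear time)."""
    return sum(kernel(x, y) for y in points) / len(points)


def random_sampling_estimate(points, x, kernel, num_samples, rng):
    """Baseline: average the kernel over uniformly sampled points."""
    return sum(kernel(x, rng.choice(points)) for _ in range(num_samples)) / num_samples


def simhash_collision_prob(x, y, bits):
    """Probability that x and y agree on all `bits` random-hyperplane signs."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    angle = math.acos(max(-1.0, min(1.0, dot / norms)))
    return (1.0 - angle / math.pi) ** bits


class SimHashTable:
    """One hash table keyed by the signs of `bits` random projections."""

    def __init__(self, points, dim, bits, rng):
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]
        self.buckets = {}
        for y in points:
            self.buckets.setdefault(self.key(y), []).append(y)

    def key(self, v):
        return tuple(
            sum(p_i * v_i for p_i, v_i in zip(plane, v)) >= 0.0 for plane in self.planes
        )


def hbe_estimate(points, x, kernel, tables, bits, rng):
    """Average one hashing-based estimate per table.

    Per table: pick y uniformly from the query's bucket B and return
    k(x, y) * |B| / (|P| * p(x, y)); an empty bucket contributes 0.
    """
    n = len(points)
    total = 0.0
    for table in tables:
        bucket = table.buckets.get(table.key(x), [])
        if bucket:
            y = rng.choice(bucket)
            total += kernel(x, y) * len(bucket) / (n * simhash_collision_prob(x, y, bits))
    return total / len(tables)


def demo(seed=0, n=200, dim=3, bits=2, num_tables=500):
    """Compare the exact KDE with both estimators on synthetic data."""
    rng = random.Random(seed)
    points = [tuple(1.0 + 0.3 * rng.gauss(0.0, 1.0) for _ in range(dim)) for _ in range(n)]
    x = tuple(1.0 for _ in range(dim))
    tables = [SimHashTable(points, dim, bits, rng) for _ in range(num_tables)]
    exact = exact_kde(points, x, gaussian_kernel)
    hbe = hbe_estimate(points, x, gaussian_kernel, tables, bits, rng)
    sampled = random_sampling_estimate(points, x, gaussian_kernel, num_tables, rng)
    return exact, hbe, sampled


if __name__ == "__main__":
    exact, hbe, sampled = demo()
    print(f"exact={exact:.4f}  hbe={hbe:.4f}  random_sampling={sampled:.4f}")
```

On this easy, low-dimensional synthetic data both estimators land close to the exact value, so the sketch only illustrates unbiasedness. It does not reproduce the variance bounds or the sublinear query times claimed above, which depend on the choice of hash family for each kernel.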
