Nonparametric Density Estimation: Toward Computational Tractability

Density estimation is a core operation of virtually all probabilistic learning methods (as opposed to discriminative methods). Approaches to density estimation fall into two principal classes: parametric methods, such as Bayesian networks, and nonparametric methods, such as kernel density estimation and smoothing splines. While neither class should be preferred in all situations, a well-known benefit of nonparametric methods is their ability to achieve estimation optimality for any input distribution as more data are observed. No model with a parametric assumption can have this property, which is of great importance in exploratory data analysis and mining, where the underlying distribution is decidedly unknown. To date, however, despite a wealth of advanced underlying statistical theory, the use of nonparametric methods has been limited by their computational intractability for all but the smallest datasets. In this paper, we present an algorithm for kernel density estimation, the chief nonparametric approach, which is dramatically faster than previous algorithmic approaches in terms of both dataset size and dimensionality. Furthermore, the algorithm provides arbitrarily tight accuracy guarantees, provides anytime convergence, works for all common kernel choices, and requires no parameter tuning. The algorithm is an instance of a new principle of algorithm design: multi-recursion, or higher-order
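To make the computational obstacle concrete, the following is a minimal sketch of direct (naive) kernel density estimation in one dimension. It is not the algorithm presented in the paper. It only illustrates the baseline, in which every query point visits every data point, so evaluating the estimate at all N data points costs on the order of N^2 kernel evaluations. The function names, the choice of a Gaussian kernel, and the fixed bandwidth are assumptions made for illustration.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used here as one common kernel choice (assumed)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def naive_kde(queries, data, bandwidth):
    """Evaluate a 1-D kernel density estimate at each query point.

    Every query visits every data point, so the total cost is
    len(queries) * len(data) kernel evaluations.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if not data:
        raise ValueError("data must be non-empty")
    n = len(data)
    estimates = []
    for x in queries:
        # Sum the kernel contribution of every data point to this query.
        total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in data)
        estimates.append(total / (n * bandwidth))
    return estimates


# Hypothetical toy data: three points, density evaluated at the origin.
data = [-1.0, 0.0, 1.0]
density = naive_kde([0.0], data, bandwidth=1.0)
```

With the data points doubling as query points, the double loop above is the quadratic cost that limits direct evaluation to small datasets.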
