A tutorial on kernel density estimation and recent advances

ABSTRACT This tutorial provides a gentle introduction to kernel density estimation (KDE) and to recent advances in confidence bands and in the inference of geometric and topological features. We begin with the basic properties of KDE: the convergence rate under various metrics, density derivative estimation, and bandwidth selection. We then introduce common approaches to constructing confidence intervals and bands, and discuss how to handle bias. Next, we review recent advances in using KDE to infer geometric and topological features of a density function. Finally, we illustrate how KDE can be used to estimate a cumulative distribution function and a receiver operating characteristic curve. We provide R implementations related to this tutorial at the end.
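
As a rough illustration of the objects named in the abstract, the following is a minimal one-dimensional sketch in Python (not the R implementations the tutorial refers to). It assumes a Gaussian kernel and the normal-reference rule of thumb h = 1.06 * sd * n^(-1/5) for the bandwidth; both are illustrative choices rather than the tutorial's recommended settings, and the function names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def rule_of_thumb_bandwidth(data):
    """Normal-reference bandwidth h = 1.06 * sd * n^(-1/5) (illustrative choice)."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in data) / (n - 1))
    return 1.06 * sd * n ** (-1.0 / 5.0)


def kde(x, data, h):
    """Kernel density estimate at x: (1 / (n h)) * sum_i K((x - X_i) / h)."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)


def kde_cdf(x, data, h):
    """Smoothed CDF obtained by integrating the KDE.

    For a Gaussian kernel this is the average of Phi((x - X_i) / h),
    where Phi is the standard normal CDF.
    """
    n = len(data)
    return sum(
        0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0)))) for xi in data
    ) / n


if __name__ == "__main__":
    sample = [-1.0, 0.0, 1.0]  # toy data, for illustration only
    h = rule_of_thumb_bandwidth(sample)
    print("bandwidth:", h)
    print("density at 0:", kde(0.0, sample, h))
    print("smoothed CDF at 0:", kde_cdf(0.0, sample, h))
```

A smaller bandwidth makes the estimate follow the sample more closely at the cost of higher variance, while a larger one smooths more and increases bias; the same trade-off carries over to the smoothed CDF.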
