Multi-dimensional Density Estimation

Modern data analysis requires a number of tools to uncover hidden structure. For initial exploration of data, animated scatter diagrams and nonparametric density estimation in its many forms are the techniques of choice. This article focuses on the application of histograms and nonparametric kernel methods to data exploration. The details of theory, computation, visualization, and presentation are all described.
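As a rough illustration of the two estimator families named above, the following minimal sketch builds a multivariate histogram density and a product Gaussian kernel estimate with NumPy. The function names are hypothetical, and the normal-reference bandwidth rule (sample standard deviation times n^(-1/(d+4)) per coordinate) is an assumed rule of thumb, not a choice taken from the text above.

```python
import numpy as np


def histogram_density(data, bins=10):
    """Histogram density estimate for data of shape (n, d).

    Returns (density, edges); density is normalised so that it
    integrates to 1 over the rectangular mesh defined by edges.
    """
    data = np.asarray(data, dtype=float)
    density, edges = np.histogramdd(data, bins=bins, density=True)
    return density, edges


def normal_reference_bandwidths(data):
    """Assumed rule of thumb: h_j = sigma_j * n ** (-1 / (d + 4))."""
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    return data.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))


def product_kernel_density(data, points, bandwidths):
    """Product Gaussian kernel estimate evaluated at points of shape (m, d)."""
    data = np.asarray(data, dtype=float)        # (n, d) sample
    points = np.asarray(points, dtype=float)    # (m, d) evaluation points
    h = np.asarray(bandwidths, dtype=float)     # (d,) one bandwidth per axis
    # Standardised differences between every point and every observation.
    z = (points[:, None, :] - data[None, :, :]) / h
    norm = (2.0 * np.pi) ** (data.shape[1] / 2.0) * h.prod()
    # Average the kernel contributions over the n observations.
    return np.exp(-0.5 * (z ** 2).sum(axis=2)).mean(axis=1) / norm


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=(400, 2))          # synthetic bivariate data
    dens, edges = histogram_density(sample, bins=8)
    h = normal_reference_bandwidths(sample)
    at_origin = product_kernel_density(sample, np.zeros((1, 2)), h)
    print(dens.shape, h, at_origin)
```

The kernel evaluation forms an m-by-n-by-d array, which is simple but memory-hungry; for large samples or fine grids a binned or FFT-based approximation would be the more practical choice.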
