Bin width selection in multivariate histograms by the combinatorial method

Abstract

We present several multivariate histogram density estimates that are universally $L_1$-optimal to within a constant factor and an additive term $$O\left(\sqrt{\log n / n}\right)$$. The bin widths are chosen by the combinatorial method developed by the authors in Combinatorial Methods in Density Estimation (Springer-Verlag, 2001). The present paper solves a problem left open in that book.

[1]  H. P. Annales de l'Institut Henri Poincaré, 1931, Nature.

[2]  L. Schläfli. Gesammelte mathematische Abhandlungen, 1950.

[3]  M. Gessaman. A Consistent Nonparametric Multivariate Density Estimator Based on Statistically Equivalent Blocks, 1970.

[4]  Vladimir Vapnik, A. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.

[5]  J. V. Ryzin, et al. A histogram method of density estimation, 1973.

[6]  Helmut Hasse, et al. Mathematische Abhandlungen 3, 1975.

[7]  J. V. Ryzin, et al. Uniform consistency of a histogram density estimator and modal estimation, 1975.

[8]  Saab Abou-Jaoudé. Conditions nécessaires et suffisantes de convergence L1 en probabilité de l'histogramme pour une densité, 1976.

[9]  Saab Abou-Jaoudé. Sur la convergence L1 et L∞ de l'estimateur de la partition aléatoire pour une densité, 1976.

[10]  R. Collins, et al. Maximum entropy histograms, 1977.

[11]  D. W. Scott. On optimal and data based histograms, 1979.

[12]  D. Freedman, et al. On the histogram as a density estimator: L2 theory, 1981.

[13]  M. Rudemo. Empirical Choice of Histograms and Kernel Density Estimators, 1982.

[14]  L. Devroye, et al. Nonparametric Density Estimation: The L1 View, 1985.

[15]  The L2-optimal cell width for the histogram, 1985.

[16]  John Van Ryzin, et al. Large sample properties of maximum entropy histograms, 1986, IEEE Trans. Inf. Theory.

[17]  Atsuyuki Kogure, et al. Asymptotically Optimal Cells for a Histogram, 1987.

[18]  L. Devroye. A Course in Density Estimation, 1987.

[19]  L. Zhao, et al. Almost sure L1-norm convergence for data-based histogram density estimates, 1987.

[20]  C. C. Taylor. Akaike's information criterion and the histogram, 1987.

[21]  L. Devroye, et al. Nonparametric density estimation: the L1 view, 1987.

[22]  E. Hannan, et al. On stochastic complexity and nonparametric density estimation, 1988.

[23]  Yuichiro Kanazawa. An optimal variable cell histogram, 1988.

[24]  T. Atilgan. On derivation and application of AIC as a data-based criterion for histograms, 1990.

[25]  Peter Hall, et al. Akaike's information criterion and Kullback-Leibler loss for histogram density estimation, 1990.

[26]  L. Zhao, et al. Almost Sure $L_r$-Norm Convergence for Data-Based Histogram Density Estimates, 1991.

[27]  Yuichiro Kanazawa. An Optimal Variable Cell Histogram Based on the Sample Spacings, 1992.

[28]  T. Speed, et al. Data compression and histograms, 1992.

[29]  Yuichiro Kanazawa, et al. Hellinger distance and Akaike's information criterion for the histogram, 1993.

[30]  Hellinger distance and Kullback-Leibler loss for the kernel density estimator, 1993.

[31]  G. Lugosi, et al. Consistency of Data-driven Histogram Methods for Density Estimation and Classification, 1996.

[32]  G. Lugosi, et al. A universally acceptable smoothing factor for kernel density estimates, 1996.

[33]  G. Lugosi, et al. Nonasymptotic universal smoothing factors, kernel complexity and Yatracos classes, 1997.

[34]  M. Wand. Data-Based Choice of Histogram Bin Width, 1997.

[35]  P. Massart, et al. Risk bounds for model selection via penalization, 1999.

[36]  Jan W. H. Swanepoel, et al. Simple and effective number-of-bins circumference selectors for a histogram, 1999, Stat. Comput.

[37]  Gwenaelle Castellan. Sélection d'histogrammes ou de modèles exponentiels de polynômes par morceaux à l'aide d'un critère de type Akaike, 2000.

[38]  Luc Devroye, et al. Combinatorial methods in density estimation, 2001, Springer series in statistics.

[39]  Luc Devroye, et al. On the risk of estimates for block decreasing densities, 2003.

[40]  Yves Rozenholc, et al. How many bins should be put in a regular histogram, 2006.