An Algorithm for Data-Driven Bandwidth Selection

Analyzing a feature space that exhibits multiscale patterns often requires kernel estimation techniques with locally adaptive bandwidths, such as the variable-bandwidth mean shift. Proper selection of the kernel bandwidth is, however, a critical step for effective feature space analysis and partitioning. This paper presents a mean shift-based approach for local bandwidth selection in the multimodal, multivariate case. The method is based on a fundamental property of normal distributions regarding the bias of the normalized density gradient. The paper demonstrates that, within the large sample approximation, the local covariance is estimated by the matrix that maximizes the magnitude of the normalized mean shift vector. Using this property, it develops a reliable algorithm that takes into account the stability of local bandwidth estimates across scales. The theoretical results are validated in various space partitioning experiments involving the variable-bandwidth mean shift.
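
The maximization property can be illustrated in one dimension. For data drawn from a normal distribution with mean μ and variance σ², and a Gaussian kernel of standard deviation h, the expected mean shift at x is -h²(x - μ)/(σ² + h²), so its magnitude divided by h peaks at h = σ. The Python sketch below scans candidate bandwidths and keeps the one with the largest normalized mean shift magnitude. It is a minimal illustration under those assumptions, not the paper's full multivariate algorithm (it omits the stability analysis across scales), and the function names `normalized_mean_shift_magnitude` and `select_bandwidth` are hypothetical.

```python
import numpy as np


def normalized_mean_shift_magnitude(x, samples, h, weights=None):
    """Return |m_h(x)| / h for a 1-D Gaussian kernel with standard deviation h."""
    samples = np.asarray(samples, dtype=float)
    if weights is None:
        weights = np.ones_like(samples)
    # Gaussian kernel weights of each sample relative to the query point x.
    k = weights * np.exp(-0.5 * ((x - samples) / h) ** 2)
    # Mean shift vector: kernel-weighted mean of the samples minus x.
    mean_shift = np.sum(k * samples) / np.sum(k) - x
    return abs(mean_shift) / h


def select_bandwidth(x, samples, candidates, weights=None):
    """Pick the candidate h that maximizes the normalized mean shift magnitude."""
    scores = [
        normalized_mean_shift_magnitude(x, samples, h, weights)
        for h in candidates
    ]
    return float(candidates[int(np.argmax(scores))])


# Illustrative setup (assumed values): one normal cluster with sigma = 2.
rng = np.random.default_rng(0)
sigma = 2.0
samples = rng.normal(loc=0.0, scale=sigma, size=50_000)

# Candidate bandwidths from 0.5 to 4.0 in steps of 0.1.
candidates = np.linspace(0.5, 4.0, 36)

# The query point must be offset from the mean; at the mean the shift vanishes.
h_best = select_bandwidth(1.5, samples, candidates)
print(f"selected bandwidth: {h_best:.2f} (true sigma: {sigma})")
```

Because the objective is flat near its maximum, the bandwidth selected from a finite sample can deviate noticeably from σ, which is one reason to check the stability of the estimates across scales.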
