Space Partitioning and Regression Mode Seeking via a Mean-Shift-Inspired Algorithm

The mean shift (MS) algorithm is a nonparametric method for clustering sample points and finding the local modes of kernel density estimates by iterative gradient ascent. In this paper we develop a mean-shift-inspired algorithm that estimates the modes of regression functions and partitions the sample points in the input space. We prove convergence of the sequences generated by the algorithm and derive non-asymptotic rates of convergence of the estimated local modes for the underlying regression model. We also demonstrate the utility of the algorithm for data-enabled discovery through an application to biomolecular structure data. An extension to the subspace constrained mean shift (SCMS) algorithm, which is used to extract ridges of regression functions, is briefly discussed.
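
For orientation, the classical density-based mean shift iteration mentioned in the first sentence can be sketched as follows. This is a minimal illustration of the standard MS update with a Gaussian kernel, not the regression-mode algorithm developed in the paper; the function names, the bandwidth parameter, and the tolerance used to merge end points into modes are all assumptions made for this example.

```python
import numpy as np


def mean_shift_point(x, data, bandwidth, tol=1e-6, max_iter=500):
    """Move one starting point uphill on a Gaussian kernel density estimate.

    Each step replaces x by the kernel-weighted mean of the sample, which is
    the classical mean shift update (a form of adaptive gradient ascent).
    """
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        sq_dist = np.sum((data - x) ** 2, axis=1)
        weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)
        x_new = weights @ data / weights.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def mean_shift_cluster(data, bandwidth, merge_tol=1e-2):
    """Run mean shift from every sample point and group points by end point.

    merge_tol is a hypothetical tolerance: end points closer than this are
    treated as the same estimated mode.
    """
    data = np.asarray(data, dtype=float)
    modes = []
    labels = np.empty(len(data), dtype=int)
    for i, point in enumerate(data):
        end = mean_shift_point(point, data, bandwidth)
        for k, mode in enumerate(modes):
            if np.linalg.norm(end - mode) < merge_tol:
                labels[i] = k
                break
        else:
            # No existing mode is close enough, so record a new one.
            modes.append(end)
            labels[i] = len(modes) - 1
    return np.array(modes), labels
```

Calling `mean_shift_cluster(data, bandwidth)` on an `(n, d)` array returns the estimated modes and, for each sample point, the index of the mode its trajectory reaches, which is the partition of the sample induced by the density version of the algorithm. The bandwidth controls how many modes survive smoothing and has to be chosen for the data at hand.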
