Nonparametric modal regression

Modal regression estimates the local modes of the conditional distribution of $Y$ given $X=x$, rather than its mean as in conventional regression, and can therefore reveal structure that mean-based methods miss. We study a simple nonparametric method for modal regression, based on a kernel density estimate (KDE) of the joint distribution of $Y$ and $X$. We derive asymptotic error bounds for this method and propose techniques for constructing confidence sets and prediction sets. The prediction sets are used to select the smoothing bandwidth of the underlying KDE. Modal regression is closely connected to other ideas, such as mixture regression and density ridge estimation, and we discuss these ties as well.
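To make the idea concrete, the following is a minimal sketch of KDE-based modal regression at a single query point. It is not necessarily the estimator analyzed in the paper: the Gaussian product kernel, the grid search over $y$, the bandwidth values, the function name `conditional_modes`, and the simulated two-branch data are all illustrative assumptions. At a fixed $x_0$, the local maxima over $y$ of the joint KDE coincide with those of the estimated conditional density, since the two differ only by a factor that does not depend on $y$.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def conditional_modes(x0, X, Y, h_x, h_y, y_grid):
    """Return grid points that are local maxima over y of the joint KDE at (x0, y)."""
    n = len(X)
    # Kernel weight of each sample in the x-direction, shape (n,)
    wx = gaussian_kernel((x0 - X) / h_x)
    # Kernel values in the y-direction for every grid point, shape (m, n)
    ky = gaussian_kernel((y_grid[:, None] - Y[None, :]) / h_y)
    # Joint product-kernel density estimate along the slice x = x0, shape (m,)
    dens = ky @ wx / (n * h_x * h_y)
    # Interior grid points strictly higher than both neighbours
    is_peak = (dens[1:-1] > dens[:-2]) & (dens[1:-1] > dens[2:])
    return y_grid[1:-1][is_peak]

# Simulated data (hypothetical): Y follows one of two branches, near 0 or near 3,
# so the conditional mean (about 1.5) lies where almost no data are observed.
rng = np.random.default_rng(0)
n = 500
X = rng.uniform(0.0, 1.0, n)
branch = rng.integers(0, 2, n)
Y = np.where(branch == 0, 0.0, 3.0) + rng.normal(0.0, 0.2, n)

y_grid = np.linspace(-2.0, 5.0, 701)
modes = conditional_modes(0.5, X, Y, h_x=0.1, h_y=0.3, y_grid=y_grid)
print(modes)
```

Repeating the call over a range of `x0` values traces out the estimated modal curves. The bandwidths here are fixed by hand; the abstract indicates that the paper selects the bandwidth using prediction sets, which this sketch does not attempt.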
