Quantile regression with ℓ1-regularization and Gaussian kernels

We study the quantile regression problem through learning schemes based on ℓ1-regularization and Gaussian kernels, and present concentration estimates for the resulting algorithms. Our analysis shows that the convergence behavior of ℓ1-regularized quantile regression with Gaussian kernels is almost the same as that of RKHS-based learning schemes. Moreover, previous analyses of kernel-based quantile regression typically require the output sample values to be uniformly bounded, which excludes the common case of Gaussian noise. The error analysis presented in this paper yields satisfactory convergence rates even for unbounded sampling processes. Numerical experiments supporting the theoretical results are also provided.
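
To make the scheme concrete, below is a minimal sketch (not the authors' implementation; kernel width, regularization parameter, and data are illustrative) of ℓ1-regularized quantile regression with a Gaussian kernel. Since both the pinball loss and the ℓ1 penalty on the coefficient vector are piecewise linear, the empirical problem can be solved exactly as a linear program.

```python
# Sketch of l1-regularized kernel quantile regression via an LP.
# Objective: (1/m) sum_i rho_tau(y_i - sum_j c_j K(x_i, x_j)) + lam * ||c||_1,
# where rho_tau(u) = tau*u for u >= 0 and (tau - 1)*u for u < 0 (pinball loss).
import numpy as np
from scipy.optimize import linprog

def gaussian_kernel(X, Z, sigma):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - z_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def l1_quantile_regression(X, y, tau=0.5, lam=1e-3, sigma=1.0):
    """Solve the coefficient-based problem as a standard-form LP.

    Splitting c = c_plus - c_minus and the residual y - Kc = xi_plus - xi_minus
    (all parts nonnegative) makes both the pinball loss and the l1 penalty
    linear in the variables.
    """
    m = len(y)
    K = gaussian_kernel(X, X, sigma)
    # Variable order: [c_plus (m), c_minus (m), xi_plus (m), xi_minus (m)]
    cost = np.concatenate([
        lam * np.ones(m),             # l1 penalty on c_plus
        lam * np.ones(m),             # l1 penalty on c_minus
        (tau / m) * np.ones(m),       # pinball weight on positive residuals
        ((1 - tau) / m) * np.ones(m)  # pinball weight on negative residuals
    ])
    # Equality constraint: K (c_plus - c_minus) + xi_plus - xi_minus = y
    A_eq = np.hstack([K, -K, np.eye(m), -np.eye(m)])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    c = res.x[:m] - res.x[m:2 * m]
    return c, K

# Usage: estimate the conditional 0.9-quantile from data with Gaussian noise,
# the unbounded-output setting covered by the analysis in this paper.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(np.pi * X[:, 0]) + rng.normal(scale=0.2, size=200)
c, K = l1_quantile_regression(X, y, tau=0.9, lam=1e-3, sigma=0.3)
f_hat = K @ c  # fitted quantile values at the training points
```

The linear-programming reformulation mirrors the familiar contrast between ℓ1 (linear-programming) and ℓ2 (quadratic-programming) kernel methods; for large samples a dedicated solver would be preferable to a general-purpose LP routine.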
