Confidence bands for least squares support vector machine classifiers: A regression approach

This paper presents bias-corrected 100(1 − α)% simultaneous confidence bands for least squares support vector machine classifiers based on a regression framework. The bias, which is inherently present in every nonparametric method, is estimated using double smoothing. To obtain simultaneous confidence bands, we make use of the volume-of-tube formula. We also provide extensions of this formula to higher dimensions and show that the width of the bands expands with increasing dimensionality. Simulations and data analysis support the usefulness of the approach in real-life classification problems.
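To make the regression framework concrete, the following is a minimal sketch of a least squares support vector machine fitted to ±1 labels as a regression problem, so that the fitted latent function is a linear smoother of the labels. It assumes the standard LS-SVM linear system and an RBF kernel; the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_latent`), the parameter values, and the toy data are illustrative assumptions, not taken from the paper. The sketch does not compute the bias correction or the confidence bands themselves.

```python
import numpy as np

def rbf_kernel(A, B, bandwidth):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def lssvm_fit(X, y, gamma=10.0, bandwidth=1.0):
    # Solve the (n+1)x(n+1) LS-SVM system
    #   [ 0   1^T         ] [ b     ]   [ 0 ]
    #   [ 1   K + I/gamma ] [ alpha ] = [ y ]
    # gamma and bandwidth are assumed tuning values, not tuned here.
    n = X.shape[0]
    K = rbf_kernel(X, X, bandwidth)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # intercept b, dual coefficients alpha

def lssvm_latent(X_train, alpha, b, X_new, bandwidth=1.0):
    # Latent (regression) function; its sign gives the predicted class.
    return rbf_kernel(X_new, X_train, bandwidth) @ alpha + b

# Toy two-class data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)),
               rng.normal(2.0, 1.0, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])

b, alpha = lssvm_fit(X, y)
latent = lssvm_latent(X, alpha, b, X)
train_accuracy = float(np.mean(np.sign(latent) == y))
```

Because the latent function is linear in the labels, pointwise variance estimates and simultaneous bands can in principle be built around it; regions where a band around the latent function contains zero would then correspond to uncertain class assignments.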
