Regularization of Case Specific Parameters: A New Approach for Improving Robustness and/or Efficiency of Statistical Methods

Regularization methods allow one to handle a variety of inferential problems where there are more covariates than cases, making it possible to consider a potentially enormous number of covariates for a problem. We exploit the power of these techniques, supersaturating models by augmenting the “natural” covariates in the problem with an additional indicator for each case in the data set. We attach a penalty term to these case-specific indicators which is designed to produce a desired effect. For regression methods with squared error loss, an ℓ1-type penalty on the case-specific parameters produces a regression which is robust to outliers and high-leverage cases. Through this modification we devise a robust LASSO which retains the desirable properties of the LASSO and performs better when outlying observations are present. For quantile regression methods, an ℓ2-type penalty decreases the variance of the fit enough to overcome an increase in bias. The paradigm thus allows us to robustify procedures which lack robustness and to increase the efficiency of procedures which are robust. Including the case-specific parameters can be viewed as a modification of the original loss function that produces a better estimator. For the LASSO with squared error loss, the modification yields Huber’s loss. The check loss function in quantile regression is adjusted to be quadratic near its minimum. This modification produces an averaging effect near the target quantile and thus more efficient quantile estimation in various settings. Applications to classification procedures such as logistic regression and support vector machines are also considered. Finally, a modification to cross-validation through use of a new validation function in quantile regression is investigated. The new validation function makes use of the same adjusted check loss that is used for estimation.
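The core device can be made precise by profiling out the case-specific parameters. A minimal sketch, in our own notation (the author's scaling of the penalty parameter λ may differ): with one indicator γ_i per case, the augmented least squares criterion is

\[
\min_{\beta,\,\gamma}\ \frac{1}{2}\sum_{i=1}^{n}\left(y_i - x_i^{\top}\beta - \gamma_i\right)^2 \;+\; \lambda \sum_{i=1}^{n} \lvert \gamma_i \rvert .
\]

For a fixed residual \( r_i = y_i - x_i^{\top}\beta \), the minimizer over \( \gamma_i \) is the soft-thresholding rule \( \hat{\gamma}_i = \operatorname{sign}(r_i)\,(\lvert r_i \rvert - \lambda)_+ \); substituting it back yields the profiled loss

\[
\rho_{\lambda}(r) =
\begin{cases}
\tfrac{1}{2} r^2, & \lvert r \rvert \le \lambda,\\
\lambda \lvert r \rvert - \tfrac{\lambda^2}{2}, & \lvert r \rvert > \lambda,
\end{cases}
\]

which is Huber's loss with bend at λ: gross residuals are absorbed by their case indicators and contribute only linearly. The quantile analogue pairs the check loss \( \rho_{\tau}(r) = r\,(\tau - I(r < 0)) \) with an ℓ2 penalty \( \lambda \gamma_i^2 \); minimizing over \( \gamma_i \) replaces the kink at zero with a quadratic segment,

\[
\rho_{\tau,\lambda}(r) =
\begin{cases}
\tau r - \tfrac{\tau^{2}}{4\lambda}, & r \ge \tfrac{\tau}{2\lambda},\\
\lambda r^{2}, & -\tfrac{1-\tau}{2\lambda} \le r < \tfrac{\tau}{2\lambda},\\
(\tau - 1)\, r - \tfrac{(1-\tau)^{2}}{4\lambda}, & r < -\tfrac{1-\tau}{2\lambda},
\end{cases}
\]

the "quadratic near its minimum" adjustment described above, whose local averaging is the source of the efficiency gain.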
