Detecting Multiple Influential Observations in High Dimensional Linear Regression

In this paper, we consider the detection of multiple influential observations in high dimensional regression, where the number of covariates p is much larger than the sample size n. Detecting influential observations in this setting is challenging. For the case of a single influential observation, Zhao et al. (2013) developed a method called the High dimensional Influence Measure (HIM). However, the results for HIM do not carry over to the case of multiple influential observations, where detection is considerably more complicated than in the single-observation case. We propose a new method, based on multiple deletion, to detect multiple influential observations.
