A Diagnostic Measure for Influential Observations in Linear Regression

In linear regression, a common way to measure the influence of an observation is to delete the case from the analysis and examine the resulting change in the parameter estimates or in the vector of forecasts. Peña (2005) introduced a different idea: measure the influence of an observation by how that observation is influenced by the rest of the data. In this article we propose a new influence measure that extends Peña's idea to group deletion, in order to identify multiple influential observations in linear regression. We investigate the usefulness of the proposed technique using two widely referenced data sets, a large artificial data set with high dimension and heterogeneous sample points, and a Monte Carlo simulation experiment.
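The two single-case viewpoints contrasted above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the group-deletion measure proposed in the article: it uses the textbook form of Cook's distance for the classical case-deletion view, and Peña's statistic as it is commonly written, S_i = s_i' s_i / (p Var(ŷ_i)), where s_i collects the changes in the forecast of case i when each case j is deleted in turn. The function name `influence_measures` and the simulated data are hypothetical.

```python
import numpy as np

def influence_measures(X, y):
    """Single-case influence sketch for an OLS fit of y on X.

    X must already contain the intercept column and have full column rank.
    Returns (cook_d, pena_s), each of length n.
    """
    n, p = X.shape
    H = X @ np.linalg.pinv(X)          # hat matrix X (X'X)^{-1} X'
    h = np.diag(H)                     # leverages h_ii
    e = y - H @ y                      # OLS residuals
    s2 = e @ e / (n - p)               # residual variance estimate

    # T[i, j] = yhat_i - yhat_i(j): change in the forecast of case i
    # when case j is deleted, using h_ij * e_j / (1 - h_jj).
    T = H * (e / (1.0 - h))

    # Classical view: how much does deleting case j move all forecasts?
    # (column sums of T**2; equals the usual Cook's distance formula)
    cook_d = (T ** 2).sum(axis=0) / (p * s2)

    # Pena's view: how much is the forecast of case i moved by deleting
    # each of the other cases? (row sums, scaled by Var(yhat_i) = s2 * h_ii)
    pena_s = (T ** 2).sum(axis=1) / (p * s2 * h)
    return cook_d, pena_s

# Usage on simulated data (assumed for illustration only).
rng = np.random.default_rng(1)
X_demo = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y_demo = X_demo @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=30)
cook_demo, pena_demo = influence_measures(X_demo, y_demo)
print("largest Cook's distance at case", int(np.argmax(cook_demo)))
print("largest Pena statistic at case", int(np.argmax(pena_demo)))
```

Both quantities are built from the same matrix of forecast changes; Cook's distance summarises it by the deleted case (columns), while Peña's statistic summarises it by the affected case (rows). A group-deletion extension would replace the single deleted case j by a suspect set of cases, which this sketch does not attempt.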

[1]  R. Welsch. Influence Functions and Regression Diagnostics, 1982.

[2]  D. Pregibon. Logistic Regression Diagnostics, 1981.

[3]  Kenneth Portier, et al. Robust Diagnostic Regression Analysis, 2002, Technometrics.

[4]  S. Weisberg, et al. Residuals and Influence in Regression, 1982.

[5]  Peter J. Rousseeuw, et al. Robust Regression and Outlier Detection, 1987.

[6]  A. H. M. Rahmatullah Imon, et al. Identifying Multiple Influential Observations in Linear Regression, 2005.

[7]  Ali S. Hadi, et al. Regression Analysis by Example, 2006.

[8]  S. Chatterjee, et al. Regression Analysis by Example, 1979.

[9]  J. Brian Gray, et al. Introduction to Linear Regression Analysis, 2002, Technometrics.

[10]  G. V. Kass, et al. Location of Several Outliers in Multiple-Regression Data Using Elemental Sets, 1984.

[11]  J. Simonoff, et al. Procedures for the Identification of Multiple Outliers in Linear Models, 1993.

[12]  Daniel Zelterman. Applied Linear Models with SAS: Introduction to Linear Regression, 2010.

[13]  P. Rousseeuw. Least Median of Squares Regression, 1984.

[14]  Jammalamadaka. Introduction to Linear Regression Analysis (3rd ed.), 2003.

[15]  A. Hossain, et al. A Comparative Study on Detection of Influential Observations in Linear Regression, 1991.

[16]  R. Cook. Detection of Influential Observation in Linear Regression, 2000.

[17]  Daniel Peña, et al. A New Statistic for Influence in Linear Regression, 2005, Technometrics.

[18]  V. Yohai, et al. The Detection of Influential Subsets in Linear Regression by Using an Influence Matrix, 1995.

[19]  A. Hadi, et al. BACON: Blocked Adaptive Computationally Efficient Outlier Nominators, 2000.

[20]  M. R. Norazan, et al. The Performance of Diagnostic-Robust Generalized Potentials for the Identification of Multiple High Leverage Points in Linear Regression, 2009.

[21]  Peter J. Rousseeuw, et al. Robust Regression and Outlier Detection, 2005, Wiley Series in Probability and Statistics.

[22]  Jeffrey S. Simonoff, et al. General Approaches to Stepwise Identification of Unusual Values in Data Analysis, 1991.

[23]  J. A. John, et al. Influential Observations and Outliers in Regression, 1981.

[24]  W. W. Muir, et al. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, 1980.

[25]  Madeleine Walker, et al. Masking Unmasked, 2002, The Journal of Audiovisual Media in Medicine.