Identification of multiple influential observations in logistic regression

The identification of influential observations in logistic regression has drawn a great deal of attention in recent years. Most available techniques, such as Cook's distance and difference of fits (DFFITS), are based on single-case deletion. There is evidence, however, that these techniques suffer from masking (influential cases hiding one another) and swamping (clean cases being flagged as influential), and consequently fail to detect multiple influential observations. In this paper, we develop a new measure for identifying multiple influential observations in logistic regression, based on a generalized version of DFFITS. The advantage of the proposed method is then investigated through several well-known data sets and a simulation study.
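To make the single-case deletion baseline concrete, the following is a minimal sketch of the usual one-step approximations for logistic regression: leverages, a Cook's-type distance, and a DFFITS-type quantity built from standardized Pearson residuals. It is not the measure proposed in the paper; the function names, the toy data, and the exact scaling of the DFFITS-type quantity are assumptions made for illustration, and the paper's own definitions may differ.

```python
import numpy as np

def fit_logistic_irls(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson (IRLS). Hypothetical helper."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        # Newton step: solve (X' W X) step = X' (y - p)
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def single_case_diagnostics(X, y):
    """One-step single-case deletion diagnostics (illustrative sketch)."""
    k = X.shape[1]
    beta = fit_logistic_irls(X, y)
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    xtwx_inv = np.linalg.inv(X.T @ (w[:, None] * X))
    # Leverage: diagonal of W^(1/2) X (X' W X)^(-1) X' W^(1/2)
    h = w * np.einsum("ij,jk,ik->i", X, xtwx_inv, X)
    r_pearson = (y - p) / np.sqrt(w)
    r_std = r_pearson / np.sqrt(1.0 - h)
    # Assumed forms: Cook's-type distance and a DFFITS-type quantity
    cook = r_std**2 * h / (k * (1.0 - h))
    dffits = r_std * np.sqrt(h / (1.0 - h))
    return {"beta": beta, "leverage": h, "cook": cook, "dffits": dffits}

# Toy data (made up): one predictor, with a few labels flipped so the
# classes overlap and the maximum likelihood estimate exists.
x = np.linspace(-2.0, 2.0, 40)
y = (x > 0).astype(float)
y[[2, 15]] = 1.0
y[[24, 30]] = 0.0
X = np.column_stack([np.ones_like(x), x])

diag = single_case_diagnostics(X, y)
top = np.argsort(diag["cook"])[::-1][:3]
print("cases with largest Cook's-type distance:", top)
```

Because each case is scored with all other cases still in the fit, a group of similar unusual cases can keep one another's scores low, which is the masking problem described above.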
