Outfluence - The impact of missing values

Numerous measures assess the effect of an observation, a group of observations, a variable, or a combination of variables and observations on regression estimates. Incomplete data is a common difficulty in data analysis. We introduce a new measure, which we call "outfluence", that assesses the effect of a missing observation, a group of missing observations, an incomplete variable, or any combination of these on the overall estimation. The outfluence measure can be used in regression analysis or in any other parametric setting. We illustrate its major benefits with two examples. The first applies outfluence to a marijuana pilot study. The second uses blood cholesterol levels of heart attack victims recorded at three time points after the heart attack; with incomplete data, we evaluate the mean cholesterol level at time 3 and the difference between time 1 and time 3.
