An Analysis of Sample Attrition in Panel Data: The Michigan Panel Study of Income Dynamics

By 1989 the Michigan Panel Study of Income Dynamics (PSID) had lost approximately 50 percent of its initial 1968 sample to cumulative attrition. We study the effect of this attrition on the unconditional distributions of several socioeconomic variables and on the estimates of several sets of regression coefficients. We provide a statistical framework for testing for attrition bias that draws a sharp distinction between selection on unobservables and selection on observables, and that shows that weighted least squares can generate consistent parameter estimates when selection is based on observables, even when those observables are endogenous. Our empirical analysis shows that attrition is highly selective and is concentrated among individuals of lower socioeconomic status. We also show that attrition is concentrated among those with more unstable earnings, marriage, and migration histories. Nevertheless, we find that these variables explain very little of the attrition in the sample, and that the selection that does occur is moderated by regression-to-the-mean effects: selection on transitory components fades over time. Consequently, despite the large amount of attrition, we find no strong evidence that attrition has seriously distorted the representativeness of the PSID through 1989, and considerable evidence that its cross-sectional representativeness has remained roughly intact.
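The claim that weighting can restore consistency under selection on observables can be illustrated with a small simulation. The sketch below is not the authors' estimator or data: it uses plain inverse-probability weights, which are not necessarily the weights the authors construct, and every name and number in it (`simulate_panel`, `wls`, the retention rates 0.9 and 0.3, the coefficients 1.0 and 2.0) is a hypothetical choice made for illustration. An observable indicator `z` is correlated with the regression error, so it is endogenous, and retention depends only on `z`.

```python
import numpy as np


def simulate_panel(n=200_000, seed=0):
    """Simulate y = 1 + 2x + e with attrition driven by an observable z.

    All parameter values are illustrative assumptions, not PSID quantities.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = rng.normal(size=n)
    y = 1.0 + 2.0 * x + e
    # z is observed for everyone but depends on e, so it is endogenous.
    z = (x + e + rng.normal(size=n) > 0).astype(int)
    # Retention depends on z only: selection on observables.
    p_retain = np.where(z == 1, 0.9, 0.3)
    retained = rng.random(n) < p_retain
    return x, y, z, retained


def wls(x, y, w):
    """Weighted least squares of y on a constant and x with weights w."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


x, y, z, retained = simulate_panel()

# Estimate the retention probability within each cell of z.
p_hat = np.array([retained[z == k].mean() for k in (0, 1)])

# Inverse-probability weights for the retained observations.
w = 1.0 / p_hat[z[retained]]

beta_unweighted = wls(x[retained], y[retained], np.ones(retained.sum()))
beta_weighted = wls(x[retained], y[retained], w)

print("true coefficients:   [1.0, 2.0]")
print("unweighted estimate:", beta_unweighted)
print("weighted estimate:  ", beta_weighted)
```

In this setup the unweighted regression on retained observations should show a visibly distorted intercept, while the weighted regression should land close to the population coefficients. If retention depended on the error directly rather than through `z` (selection on unobservables), reweighting on `z` would not be expected to remove the bias.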
