10. Indices of Robustness for Sample Representation

Social scientists are rarely able to gather data from the full range of contexts to which they hope to generalize (Shadish, Cook, and Campbell 2002). Here we suggest that debates about the generality of causal inferences in the social sciences can be informed by quantifying the conditions necessary to invalidate an inference. We begin by differentiating the target population into two subpopulations: a potentially observed subpopulation from which the entire sample is drawn, and a potentially unobserved subpopulation from which no members of the sample are drawn but which is part of the population to which policymakers seek to generalize. We then quantify the robustness of an inference in terms of the conditions necessary to invalidate that inference if cases from the potentially unobserved subpopulation were included in the sample. We apply the indices to inferences regarding the positive effect of small classes on achievement from the Tennessee class size study and then consider the breadth of external validity. We use the statistical test for whether effects differ between two subpopulations as a baseline for evaluating robustness. We also consider a Bayesian motivation for the indices and compare their use with other procedures. In the discussion we emphasize the value of quantifying robustness, consider the value of different quantitative thresholds, and conclude by extending a metaphor linking statistical and causal inferences.

[1]  M. Sobel,et al.  Causal Inference in Sociological Studies , 2004 .

[2]  W. Shadish,et al.  Foundations of Program Evaluation: Theories of Practice , 1990 .

[3]  R. Fisher The Distribution of the Partial Correlation Coefficient. , 1924 .

[4]  Larry V. Hedges,et al.  Do Low-Achieving Students Benefit More from Small Classes? Evidence from the Tennessee Class Size Experiment , 2002 .

[5]  J. Hunter Needed: A Ban on the Significance Test , 1997 .

[6]  Andrew Abbott,et al.  The Causal Devolution , 1998 .

[7]  D. Rubin,et al.  The central role of the propensity score in observational studies for causal effects , 1983 .

[8]  Elizabeth R. Word The State of Tennessee's Student/Teacher Achievement Ratio (STAR) Project: Technical Report (1985-1990). , 1990 .

[9]  P. Rosenbaum Dropping out of High School in the United States: An Observational Study , 1986 .

[10]  R. Rosenthal The file drawer problem and tolerance for null results , 1979 .

[11]  Nicole A. Lazar,et al.  Statistical Analysis With Missing Data , 2003, Technometrics.

[12]  C. Quensel The distribution of the partial correlation coefficient in samples from multivariate universes in a special case of non-normally distributed random variables , 1953 .

[13]  S. Morgan Counterfactuals, Causal Effect Heterogeneity, and the Catholic School Effect on Learning. , 2001 .

[14]  J W Hogan,et al.  Reparameterizing the Pattern Mixture Model for Sensitivity Analyses Under Informative Dropout , 2000, Biometrics.

[15]  T. DiPrete,et al.  7. Assessing Bias in the Estimation of Causal Effects: Rosenbaum Bounds on Matching Estimators and Instrumental Variables Estimation with Imperfect Instruments , 2004 .

[16]  M. Sobel An Introduction to Causal Inference , 1996 .

[17]  E. Barbier,et al.  Impacts of Biodiversity Loss on Ocean Ecosystem Services , 2006, Science.

[18]  M. Browne,et al.  Cross-Validation Of Covariance Structures. , 1983, Multivariate behavioral research.

[19]  M. Kendall Statistical Methods for Research Workers , 1937, Nature.

[20]  L. Cronbach,et al.  Aptitudes and instructional methods: A handbook for research on interactions , 1977 .

[21]  B. Lindsay,et al.  Multivariate Normal Mixtures: A Fast Consistent Method of Moments , 1993 .

[22]  Larry V. Hedges,et al.  The Effects of Small Classes on Academic Achievement: The Results of the Tennessee Class Size Experiment , 2000 .

[23]  Michael H. Birnbaum,et al.  Mediated Models for the Analysis of Confounded Variables and Self-Selected Samples , 1989 .

[24]  Daniel F. McCaffrey,et al.  What We Have Learned About Class Size Reduction in California , 2002 .

[25]  J. Robins A graphical approach to the identification and estimation of causal parameters in mortality studies with sustained exposure periods. , 1987, Journal of chronic diseases.

[26]  W. Shadish,et al.  Experimental and Quasi-Experimental Designs for Generalized Causal Inference , 2001 .

[27]  L. Cronbach,et al.  Designing evaluations of educational and social programs , 1983 .

[28]  Robert G. Orwin,et al.  A Fail-Safe N for Effect Size in Meta-Analysis , 1983 .

[29]  Barbara Entwisle,et al.  Through Thick and Thin: Layers of Social Ties and Urban Settlement among Thai Migrants , 2005 .

[30]  Eric A. Hanushek,et al.  Some Findings From an Independent Investigation of the Tennessee STAR Experiment and From Other Investigations of Class Size Effects , 1999 .

[31]  D. Harding Counterfactual Models of Neighborhood Effects: The Effect of Neighborhood Poverty on Dropping Out and Teenage Pregnancy , 2003, American Journal of Sociology.

[32]  K. Frank Impact of a Confounding Variable on a Regression Coefficient , 2000 .

[33]  Roderick J. A. Little Regression with Missing X's: A Review , 1992 .

[34]  L. Hedges Modeling publication selection effects in meta-analysis , 1992 .

[35]  J. Tukey,et al.  AVERAGE VALUES OF MEAN SQUARES IN FACTORIALS , 1956 .

[36]  James M. Robins,et al.  Causal inference for complex longitudinal data: the continuous case , 2001 .

[37]  D. Rubin Estimating causal effects of treatments in randomized and nonrandomized studies. , 1974 .

[38]  Kenneth A. Frank,et al.  A Probability Index of the Robustness of a Causal Inference , 2003 .

[39]  L. Delbeke Quasi-Experimentation: Design and Analysis Issues for Field Settings (Cook, T. D., and Campbell, D. T.) , 1980 .

[40]  P. Allison Multiple Imputation for Missing Data , 2000 .

[41]  C. Manski Nonparametric Bounds on Treatment Effects , 1989 .

[42]  Geoffrey J. McLachlan,et al.  Finite Mixture Models , 2019, Annual Review of Statistics and Its Application.

[43]  J. Marschak Economic Measurements for Policy and Prediction , 1974 .

[44]  George W. Bohrnstedt,et al.  What We Have Learned about Class Size Reduction in California. Capstone Report. , 2002 .

[45]  P. Deb Finite Mixture Models , 2008 .

[46]  J. Schafer,et al.  A comparison of inclusive and restrictive strategies in modern missing data procedures. , 2001, Psychological methods.

[47]  N. E. Day Estimating the components of a mixture of normal distributions , 1969 .

[48]  M. Sobel,et al.  Identification Problems in the Social Sciences. , 1996 .

[49]  P. Suppes A Probabilistic Theory Of Causality , 1970 .

[50]  W. Cole Sovereignty Relinquished? Explaining Commitment to the International Human Rights Covenants, 1966-1999 , 2005 .

[51]  Anthony S. Bryk,et al.  Hierarchical Linear Models: Applications and Data Analysis Methods , 1992 .

[52]  T. Cook,et al.  Quasi-experimentation: Design & analysis issues for field settings , 1979 .

[53]  J. Finn,et al.  Answers and Questions About Class Size: A Statewide Experiment , 1990 .

[54]  J. Robins,et al.  Sensitivity Analysis for Selection bias and unmeasured Confounding in missing Data and Causal inference models , 2000 .

[55]  R. Fisher Statistical methods for research workers , 1927, Protoplasma.

[56]  R. Orwin A fail-safe N for effect size in meta-analysis. , 1983 .

[57]  D. Rubin,et al.  Estimating and Using Propensity Scores with Partially Missing Data , 2000 .

[58]  L. Joseph,et al.  Bayesian Statistics: An Introduction , 1989 .

[59]  Leland Wilkinson,et al.  Statistical Methods in Psychology Journals Guidelines and Explanations , 2005 .

[60]  Kyung-Seok Min,et al.  The Impact of Nonignorable Missing Data on the Inference of Regression Coefficients. , 2002 .

[61]  Paul R Rosenbaum,et al.  Attributing Effects to Treatment in Matched Observational Studies , 2002 .

[62]  D. Harding Counterfactual Models of Neighborhood Effects: The Effect of Neighborhood Poverty on High School Dropout and Teenage Pregnancy , 2002 .

[63]  P. Holland Statistics and Causal Inference , 1985 .

[64]  Grégoire Mallard Interpreters of the Literary Canon and Their Technical Instruments: The Case of Balzac Criticism , 2005 .

[65]  Jay Brand,et al.  File Drawer Problem , 2022, The SAGE Encyclopedia of Research Design.

[66]  R. Irizarry,et al.  Generalized Additive Selection Models for the Analysis of Studies with Potentially Nonignorable Missing Outcome Data , 2003, Biometrics.

[67]  Howard Wainer,et al.  Shaping Up the Practice of Null Hypothesis Significance Testing , 2003 .

[68]  M. Sobel Causal Inference in Statistical Models of the Process of Socioeconomic Achievement , 1998 .

[69]  S Duval,et al.  Trim and Fill: A Simple Funnel‐Plot–Based Method of Testing and Adjusting for Publication Bias in Meta‐Analysis , 2000, Biometrics.

[70]  J. Copas,et al.  Inference for Non‐random Samples , 1997 .

[71]  George W. Bohrnstedt,et al.  Class Size Reduction in California: The 1998-99 Evaluation Findings , 2000 .

[72]  M. Sobel Causal Inference in the Social and Behavioral Sciences , 1995 .

[73]  Freda Kemp Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences , 2003 .

[74]  T. Cook Randomized Experiments in Education: Why Are They So Rare?. , 2002 .