Exit Polling and Racial Bloc Voting: Combining Individual-Level and R × C Ecological Data

Despite its shortcomings, cross-level or ecological inference remains a necessary part of some areas of quantitative inference, including United States voting rights litigation. Ecological inference suffers from a lack of identification that, most agree, is best addressed by incorporating individual-level data into the model. In this paper we test the limits of such an incorporation by attempting it in the context of drawing inferences about racial voting patterns from a combination of an exit poll and precinct-level ecological data. Accurate information about racial voting patterns is needed to assess triggers in voting rights laws that can determine the composition of United States legislative bodies. Specifically, we extend and study a hybrid model that addresses two-way tables of arbitrary dimension. We apply the hybrid model to an exit poll we administered in the City of Boston in 2008. Using the resulting data as well as simulation, we compare the performance of a pure ecological estimator, pure survey estimators using various sampling schemes, and our hybrid. We conclude that the hybrid estimator offers substantial benefits by enabling substantive inferences about voting patterns not practicably available without its use.
