Latent class analysis of response inconsistencies across modes of data collection.

Latent class analysis (LCA) has been hailed as a promising technique for studying measurement errors in surveys, because the models produce estimates of the error rates associated with a given question. Still, the issue arises as to how accurate these error estimates are and under what circumstances they can be relied on. Skeptics argue that latent class models can understate the true error rates, and at least one paper (Kreuter et al., 2008) demonstrates such underestimation empirically. We applied latent class models to data from two waves of the National Survey of Family Growth (NSFG), focusing on a pair of similar items about abortion that are administered under different modes of data collection. The first item is administered by computer-assisted personal interviewing (CAPI); the second, by audio computer-assisted self-interviewing (ACASI). Evidence shows that abortions are underreported in the NSFG, and the conventional wisdom is that the ACASI item yields fewer false negatives than the CAPI item. To evaluate these items, we made assumptions about the error rates within various subgroups of the population; these assumptions were needed to achieve an identifiable LCA model. Because external data are available on the actual prevalence of abortion (by subgroup), we were able to form subgroups for which the identifying restrictions were likely to be (approximately) met and other subgroups for which the assumptions were likely to be violated. We also ran more complex models that took potential heterogeneity within subgroups into account. Most of the models yielded implausibly low error rates, supporting the argument that, under specific conditions, LCA models underestimate the error rates.
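To make the role of the identifying restrictions concrete, the following is a minimal sketch of one common way such a model can be identified: two binary items, two latent classes, error rates assumed equal across subgroups, and prevalence allowed to differ by subgroup. It is an illustration only, not the specification used in the study. The function names, the starting values, and the coding of item A as the CAPI item and item B as the ACASI item are all assumptions made for the example.

```python
import numpy as np

def response_probs(p):
    """2x2 table P(A=a, B=b | class), assuming local independence.

    p[0] and p[1] are the probabilities of a "yes" on items A and B.
    """
    a = np.array([1.0 - p[0], p[0]])
    b = np.array([1.0 - p[1], p[1]])
    return np.outer(a, b)

def fit_two_group_lca(counts, n_iter=5000):
    """EM for a two-class model with error rates held equal across subgroups.

    counts[g, a, b] = number of respondents in subgroup g who answered a on
    item A (hypothetically CAPI) and b on item B (hypothetically ACASI),
    with answers coded 0/1.
    """
    counts = np.asarray(counts, dtype=float)
    n_groups = counts.shape[0]
    # Arbitrary starting values (an assumption of this sketch).
    prev = np.linspace(0.2, 0.4, n_groups)  # P(true yes | subgroup)
    sens = np.array([0.7, 0.7])             # P(report yes | true yes), per item
    fpos = np.array([0.1, 0.1])             # P(report yes | true no), per item
    for _ in range(n_iter):
        # E-step: posterior probability of a true "yes" in each cell.
        joint1 = prev[:, None, None] * response_probs(sens)
        joint0 = (1.0 - prev)[:, None, None] * response_probs(fpos)
        post = joint1 / (joint1 + joint0)
        # M-step: reweighted proportions.
        w1 = counts * post
        w0 = counts * (1.0 - post)
        prev = w1.sum(axis=(1, 2)) / counts.sum(axis=(1, 2))
        sens = np.array([w1[:, 1, :].sum(), w1[:, :, 1].sum()]) / w1.sum()
        fpos = np.array([w0[:, 1, :].sum(), w0[:, :, 1].sum()]) / w0.sum()
    return prev, sens, fpos
```

Under these assumptions the estimated false-negative rate for each item is `1 - sens`. With two subgroups the model has as many parameters as free cell proportions, so it fits any table exactly; the estimates are trustworthy only to the extent that the equal-error-rate and local-independence assumptions actually hold, which is the concern the abstract raises.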

[1]  L. A. Goodman.  Exploratory latent structure analysis using both identifiable and unidentifiable models , 1974 .

[2]  Bruce D. Spencer,et al.  When Do Latent Class Models Overstate Accuracy for Binary Classifiers?: With Applications to Jury Accuracy, Survey Response Error, and Diagnostic Error , 2009 .

[3]  Paul P. Biemer,et al.  Latent Class Analysis of Survey Error , 2011 .

[4]  C. Clogg.  Latent Class Models , 1995 .

[5]  W. Mosher.  Design and operation of the 1995 National Survey of Family Growth. , 1998, Family planning perspectives.

[6]  Neil Henry.  Latent structure analysis , 1969 .

[7]  Paul F. Lazarsfeld,et al.  Latent Structure Analysis. , 1969 .

[8]  Karen E Davis,et al.  National Survey of Family Growth, Cycle 6: sample design, weighting, imputation, and variance estimation. , 2006, Vital and health statistics. Series 2, Data evaluation and methods research.

[9]  R. Tourangeau,et al.  Sensitive questions in surveys. , 2007, Psychological bulletin.

[10]  Rachel K. Jones,et al.  Underreporting of induced and spontaneous abortion in the United States: an analysis of the 2002 National Survey of Family Growth. , 2007, Studies in family planning.

[11]  Maria Hewitt,et al.  Attitudes toward Interview Mode and Comparability of Reporting Sexual Behavior by Personal Interview and Audio Computer-assisted Self-interviewing , 2002 .

[12]  Peter Granda,et al.  Plan and operation of Cycle 6 of the National Survey of Family Growth. , 2005, Vital and health statistics. Ser. 1, Programs and collection procedures.

[13]  S. Henshaw,et al.  Abortion services in the United States 1991 and 1992. , 1994 .

[14]  Duane F. Alwin.  Margins of Error: A Study of Reliability in Survey Measurement , 2007 .

[15]  Paul P. Biemer,et al.  Measurement error evaluation of self‐reported drug use: a latent class analysis of the US National Household Survey on Drug Abuse , 2002 .

[16]  Paul P. Biemer,et al.  Latent Class Analysis of Survey Error , 2010 .

[17]  E. Jones,et al.  Underreporting of abortion in surveys of U.S. women: 1976 to 1988 , 1992, Demography.

[18]  P. Biemer,et al.  Estimation of measurement bias in self-reports of drug use with applications to the national household survey on drug abuse , 1996 .

[19]  Tom W. Smith,et al.  Asking Sensitive Questions: The Impact of Data Collection Mode, Question Format, and Question Context , 1996 .

[20]  Paul P. Biemer,et al.  Modeling Measurement Error to Identify Flawed Questions , 2004 .

[21]  Clifford C. Clogg,et al.  Assessing Reliability of Categorical Measurements Using Latent Class Models , 1996 .

[22]  Jeroen K. Vermunt,et al.  ℓEM: A general program for the analysis of categorical data , 1997 .

[23]  A. McCutcheon,et al.  Latent Class Analysis , 2021, Encyclopedia of Autism Spectrum Disorders.

[24]  S. Walter,et al.  Estimating the error rates of diagnostic tests. , 1980, Biometrics.

[25]  Michael D. Sinclair,et al.  On procedures for evaluating the effectiveness of reinterview survey methods : Application to labor force data , 1996 .

[26]  Roger Tourangeau,et al.  Good item or bad—can latent class analysis tell?: the utility of latent class analysis for the evaluation of survey questions , 2008 .

[27]  P M Vacek,et al.  The effect of conditional dependence on the evaluation of diagnostic tests. , 1985, Biometrics.

[28]  J. Darroch,et al.  Measuring the extent of abortion underreporting in the 1995 National Survey of Family Growth. , 1998, Family planning perspectives.

[29]  W. Mosher,et al.  Plan and operation of the 1995 National Survey of Family Growth. , 1997, Vital and health statistics. Ser. 1, Programs and collection procedures.

[30]  V. Iannacchione,et al.  Sample design, sampling weights, imputation, and variance estimation in the 1995 National Survey of Family Growth. , 1998, Vital and health statistics. Series 2, Data evaluation and methods research.

[31]  Geoffrey J. McLachlan,et al.  Finite Mixture Models , 2019, Annual Review of Statistics and Its Application.