Nested partially latent class models for dependent binary data; estimating disease etiology.

The Pneumonia Etiology Research for Child Health (PERCH) study seeks to use modern measurement technology to infer the causes of pneumonia for which gold-standard evidence is unavailable. Based on case-control data, this article describes a latent variable model designed to infer the etiology distribution for the population of cases, and for an individual case given that case's measurements. We assume each observation is drawn from a mixture model in which each component represents one disease class. The model considered here addresses a major limitation of the traditional latent class approach by accounting for residual dependence among multivariate binary outcomes given disease class, hence reducing estimation bias, retaining efficiency and offering more valid inference. Such "local dependence" within each subject is induced in the model by nesting latent subclasses within each disease class. Measurement precision and covariation can be estimated using the control sample, for which the class is known. In a Bayesian framework, we use stick-breaking priors on the subclass indicators for model-averaged inference across different numbers of subclasses. Model fit is assessed and individual diagnoses are made using posterior samples drawn by Gibbs sampling. We demonstrate the utility of the method on simulated data and on the motivating PERCH data.
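
The nesting idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: the stick-breaking fractions are drawn from Beta(1, alpha), the number of subclasses is truncated at a fixed K, and measurements are conditionally independent within a subclass. The function names, `alpha`, `K`, and the response probabilities in `theta` are all hypothetical. The sketch shows how mixing over subclasses inside a single disease class makes binary measurements dependent once the subclass is marginalized out.

```python
import random
from itertools import product


def stick_breaking_weights(alpha, K, rng):
    """Truncated stick-breaking: return K non-negative weights summing to 1.

    Assumes fractions V_k ~ Beta(1, alpha) (an illustrative choice).
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(K):
        # The last subclass takes all remaining mass so the weights sum to one.
        v = 1.0 if k == K - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


def pattern_probability(y, weights, theta):
    """Marginal probability of binary pattern y within one disease class.

    Sums over subclasses k: w_k * prod_j theta[k][j]^y_j * (1 - theta[k][j])^(1 - y_j).
    """
    total = 0.0
    for w, theta_k in zip(weights, theta):
        p = w
        for y_j, t in zip(y, theta_k):
            p *= t if y_j == 1 else 1.0 - t
        total += p
    return total


def draw_measurements(weights, theta, rng):
    """Draw a subclass, then independent binary measurements given that subclass."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return [1 if rng.random() < t else 0 for t in theta[k]]


rng = random.Random(0)
weights = stick_breaking_weights(alpha=1.0, K=3, rng=rng)

# Hypothetical positive-response probabilities: 3 subclasses x 2 measurements.
theta = [[0.9, 0.8], [0.2, 0.1], [0.5, 0.5]]

# Probabilities over all 2^2 response patterns should sum to one.
total = sum(pattern_probability(list(y), weights, theta)
            for y in product([0, 1], repeat=2))

# Local dependence with two equally weighted subclasses:
# the joint probability differs from the product of the marginals.
w2 = [0.5, 0.5]
theta2 = [[0.9, 0.9], [0.1, 0.1]]
joint_11 = pattern_probability([1, 1], w2, theta2)
marginal_1 = joint_11 + pattern_probability([1, 0], w2, theta2)
```

With a single subclass the pattern probability factorizes across measurements, which is the traditional latent class assumption; with two or more subclasses the joint probability generally no longer equals the product of the marginals, which is the residual dependence the abstract refers to.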