Non‐parametric Bayes models for mixed scale longitudinal surveys

Modeling and computation for multivariate longitudinal surveys have proven challenging, particularly when the data are not all continuous and Gaussian but include discrete measurements. In many social science surveys, study participants are selected via complex survey designs such as stratified random sampling, leading to discrepancies between the sample and the population, which are further compounded by missing data and loss to follow-up. Survey weights are typically constructed to address these issues, but it is unclear how to incorporate them into models. Motivated by data on sexual development, we propose a novel nonparametric approach for mixed-scale longitudinal data in surveys. In the proposed approach, the mixed-scale multivariate response is expressed through an underlying continuous variable, with dynamic latent factors inducing time-varying associations. Bias from the survey design is adjusted for during posterior computation, which relies on a Markov chain Monte Carlo algorithm. The approach is assessed in simulation studies and applied to the National Longitudinal Study of Adolescent to Adult Health.
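As a rough illustration of the idea of expressing a mixed-scale response through an underlying continuous variable, the toy simulation below generates continuous, binary, and ordinal measurements from a shared latent factor that evolves over time. It is only a sketch under stated assumptions, not the model proposed in the paper: the function name `simulate_mixed_scale`, the AR(1) factor dynamics, the fixed loadings, and the ordinal cutpoints are all hypothetical choices made for illustration, and survey weights are not handled here.

```python
import numpy as np

def simulate_mixed_scale(n=200, T=4, rho=0.8, seed=0):
    """Toy generator: latent continuous variables thresholded into mixed scales.

    All numeric settings below are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    loadings = np.array([1.0, 0.8, 0.6])   # hypothetical factor loadings
    cuts = np.array([-0.5, 0.5])           # hypothetical ordinal cutpoints

    # One latent factor per subject, evolving as a stationary AR(1) process.
    eta = np.zeros((n, T))
    eta[:, 0] = rng.normal(size=n)
    for t in range(1, T):
        eta[:, t] = rho * eta[:, t - 1] + np.sqrt(1 - rho**2) * rng.normal(size=n)

    # Underlying continuous variables, shape (n, T, 3).
    latent = eta[:, :, None] * loadings + rng.normal(size=(n, T, 3))

    continuous = latent[:, :, 0]                  # observed as-is
    binary = (latent[:, :, 1] > 0).astype(int)    # thresholded at zero
    ordinal = np.digitize(latent[:, :, 2], cuts)  # categories 0, 1, 2
    return continuous, binary, ordinal
```

Calling `simulate_mixed_scale(n=50, T=3)` returns three arrays of shape `(50, 3)`, one per measurement scale. Because the three outcomes share the same latent factor, they are dependent within each time point and across time, which is the basic mechanism a latent-variable formulation exploits.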
