Respondent-driven sampling as Markov chain Monte Carlo

Respondent-driven sampling (RDS) is a recently introduced, and now widely used, technique for estimating disease prevalence in hidden populations. RDS data are collected through a snowball mechanism, in which current sample members recruit future sample members. In this paper we present RDS as Markov chain Monte Carlo importance sampling, and we examine the effects of community structure and the recruitment procedure on the variance of RDS estimates. Past work has assumed that the variance of RDS estimates is primarily affected by segregation between healthy and infected individuals. We examine an illustrative model to show that this is not necessarily the case, and that bottlenecks anywhere in the network can substantially affect estimates. We also show that variance is inflated by a common design feature in which sample members are encouraged to recruit multiple future sample members. The paper concludes with suggestions for implementing and evaluating RDS studies.
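
To make the Markov chain Monte Carlo importance sampling view concrete, the following is a minimal sketch, not the paper's own code or model. It assumes an idealized referral process in which each participant recruits exactly one contact, chosen uniformly at random and with replacement, so that recruitment is a simple random walk on an undirected network. Such a walk visits people in proportion to their number of contacts, so the sketch reweights each sampled person by the inverse of their degree. The toy two-community network and the names `two_community_graph`, `random_walk_sample`, and `weighted_prevalence` are hypothetical and chosen only for illustration.

```python
import random


def two_community_graph(size, bridges):
    """Toy network: two cliques of `size` nodes joined by `bridges` cross edges.

    Fewer bridges means a tighter bottleneck between the communities.
    Requires 1 <= bridges <= size so the network is connected.
    """
    adj = {v: [] for v in range(2 * size)}

    def add_edge(a, b):
        adj[a].append(b)
        adj[b].append(a)

    for offset in (0, size):
        for i in range(size):
            for j in range(i + 1, size):
                add_edge(offset + i, offset + j)
    for k in range(bridges):
        add_edge(k, size + k)
    return adj


def random_walk_sample(adj, start, n_steps, rng):
    """Idealized recruitment chain: each person recruits one uniformly chosen contact."""
    node = start
    sample = []
    for _ in range(n_steps):
        node = rng.choice(adj[node])
        sample.append(node)
    return sample


def weighted_prevalence(sample, adj, infected):
    """Self-normalized importance-sampling estimate with weights 1 / degree."""
    weights = [1.0 / len(adj[v]) for v in sample]
    infected_weight = sum(w for v, w in zip(sample, weights) if v in infected)
    return infected_weight / sum(weights)


# Assumed scenario: infection is confined to the first community (true prevalence 0.5).
adj = two_community_graph(size=10, bridges=1)
infected = set(range(10))
rng = random.Random(0)
estimates = [
    weighted_prevalence(random_walk_sample(adj, 0, 100, rng), adj, infected)
    for _ in range(200)
]
mean = sum(estimates) / len(estimates)
spread = (sum((e - mean) ** 2 for e in estimates) / len(estimates)) ** 0.5
print(f"mean estimate {mean:.3f}, standard deviation {spread:.3f}")
```

Rerunning the simulation with a larger `bridges` value, or with `infected` spread across both communities, gives a rough way to explore how a bottleneck changes the spread of the estimates in this toy setting.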
