Analytical methods used in estimating the prevalence of HIV/AIDS from demographic and cross-sectional surveys with missing data: a systematic review

Background Seroprevalence studies often have a problem of missing data. Few studies report the proportion of missing data, and even fewer describe the methods used to adjust the results for it. The objective of this review was to determine the analytical methods used in HIV surveys with missing data.

Methods We searched for population, demographic and cross-sectional surveys of HIV published from January 2000 to April 2018 in PubMed/Medline, Web of Science Core Collection, Latin American and Caribbean Sciences Literature, Africa-Wide Information and Scopus, and reviewed the references of included articles. All potential abstracts were imported into Covidence and screened by two independent reviewers using pre-specified criteria. Disagreements were resolved through discussion. A piloted data extraction tool was used to extract data and assess the risk of bias of the eligible studies. Data were analysed quantitatively, and variables were summarised in figures and tables.

Results A total of 3426 citations were identified, 194 duplicates were removed, 3232 records were screened and 69 full-text articles were obtained. Twenty-four studies were included. The response rate for an HIV test in the included studies ranged from 32% to 96%, and the major reason for missing data was refusal to consent to an HIV test. Complete case analysis was the primary method of analysis. Among the more advanced methods, multiple imputation was used most often (11 studies, 46%), followed by Heckman's selection model (9 studies, 38%). Single imputation and instrumental variable methods were each used in only two studies, and various other methods were used in 13 studies (54%). Forty-two percent of the studies applied more than two methods in the analysis, with a maximum of four methods per study. Only 6 studies (25%) conducted a sensitivity analysis, while 11 studies (46%) reported a significant change in estimates after adjusting for missing data.

Conclusion Missing data in survey studies remain a problem in disease estimation. Our review outlined a number of methods that can be used to adjust for missing data in HIV studies; however, more information and awareness are needed to allow informed choices about which method to apply so that estimates are more reliable and representative.
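
To make concrete how an adjusted estimate can differ from a complete case estimate, the following is a minimal Python sketch, not a method taken from any of the reviewed studies. The stratum counts, the function names and the simple imputation model (a Beta draw for each stratum's prevalence followed by Bernoulli draws for untested respondents, assuming test results are missing at random within strata) are all hypothetical choices made for illustration.

```python
import random

# Hypothetical toy survey; all counts are invented for illustration only.
# Stratum B has both a higher refusal rate and a higher observed prevalence.
STRATA = {
    "A": {"n": 100, "tested": 90, "positive": 9},
    "B": {"n": 100, "tested": 40, "positive": 12},
}


def complete_case_prevalence(strata):
    """Prevalence among tested respondents only (untested are dropped)."""
    tested = sum(s["tested"] for s in strata.values())
    positive = sum(s["positive"] for s in strata.values())
    return positive / tested


def multiple_imputation_prevalence(strata, m=50, seed=0):
    """Average prevalence over m imputed datasets.

    Assumes missingness is at random within each stratum. For every
    imputation, a stratum prevalence is drawn from a Beta posterior
    (uniform prior) and each untested respondent's status is then drawn
    from a Bernoulli distribution with that prevalence.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(m):
        total_positive = 0
        total_n = 0
        for s in strata.values():
            negative = s["tested"] - s["positive"]
            p = rng.betavariate(s["positive"] + 1, negative + 1)
            missing = s["n"] - s["tested"]
            imputed_positive = sum(rng.random() < p for _ in range(missing))
            total_positive += s["positive"] + imputed_positive
            total_n += s["n"]
        estimates.append(total_positive / total_n)
    # Pooled point estimate: the mean of the m per-dataset estimates.
    return sum(estimates) / m


if __name__ == "__main__":
    print("complete case:", round(complete_case_prevalence(STRATA), 3))
    print("multiple imputation:", round(multiple_imputation_prevalence(STRATA), 3))
```

In this toy setting the complete case estimate is pulled towards the low-refusal stratum, while the imputed estimate moves towards the value implied by the full sample composition. A real analysis would also pool variances across imputations and would need to consider whether the missing-at-random assumption is plausible, which is what selection models and sensitivity analyses address.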
