A comparison of bivariate, multivariate random-effects, and Poisson correlated gamma-frailty models to meta-analyze individual patient data of ordinal scale diagnostic tests.

Individual patient data (IPD) meta-analyses are increasingly common in the literature. In studies of the diagnostic accuracy of ordinal or semi-continuous scale tests, sensitivity and specificity are often reported for a single threshold or a small set of thresholds, and a meta-analysis is conducted via a bivariate approach to account for their correlation. When IPD are available, sensitivity and specificity can be pooled for every possible threshold. Our objective was to compare the bivariate approach, which can be applied separately at every threshold, to two multivariate methods: the ordinal multivariate random-effects model and the Poisson correlated gamma-frailty model. Our comparison was both empirical, using IPD from 13 studies that evaluated the diagnostic accuracy of the 9-item Patient Health Questionnaire depression screening tool, and simulation-based. The empirical comparison showed that the two multivariate methods are more laborious to implement than the bivariate approach, both in computational time and in sensitivity to user-supplied values. Simulations showed that ignoring the within-study correlation of sensitivity and specificity across thresholds did not worsen inferences with the bivariate approach compared to the Poisson model. The ordinal approach was not suitable for simulations because the model was highly sensitive to user-supplied starting values. We tentatively recommend the bivariate approach rather than more complex multivariate methods for IPD diagnostic accuracy meta-analyses of ordinal scale tests, although the limited type of diagnostic data considered in the simulation study restricts the generalization of our findings.
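As an illustration of why IPD allow pooling at every threshold, the following minimal sketch computes sensitivity and specificity for one study at each possible cutoff of an ordinal score. It is not the authors' code. The function name `accuracy_by_threshold`, the rule that a patient screens positive when the score is at or above the threshold, and the toy data are all assumptions made for illustration. The default maximum of 27 reflects the PHQ-9 total score range of 0 to 27. The per-threshold values produced this way are the kind of study-level input that a bivariate model, fitted separately at each threshold, would pool.

```python
from typing import List, Sequence, Tuple


def accuracy_by_threshold(
    scores: Sequence[int],
    diseased: Sequence[bool],
    max_score: int = 27,
) -> List[Tuple[int, float, float]]:
    """Return (threshold, sensitivity, specificity) for thresholds 1..max_score.

    Assumption (not stated in the abstract): a patient screens positive
    when score >= threshold.
    """
    if len(scores) != len(diseased):
        raise ValueError("scores and diseased must have the same length")
    n_diseased = sum(1 for d in diseased if d)
    n_healthy = len(diseased) - n_diseased
    if n_diseased == 0 or n_healthy == 0:
        raise ValueError("need at least one diseased and one non-diseased patient")

    rows = []
    for t in range(1, max_score + 1):
        # True positives: diseased patients at or above the threshold.
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= t)
        # True negatives: non-diseased patients below the threshold.
        tn = sum(1 for s, d in zip(scores, diseased) if not d and s < t)
        rows.append((t, tp / n_diseased, tn / n_healthy))
    return rows


if __name__ == "__main__":
    # Hypothetical toy data for a single study (not from the paper).
    toy_scores = [3, 12, 15, 8, 20, 5]
    toy_status = [False, True, True, False, True, False]
    for threshold, sens, spec in accuracy_by_threshold(toy_scores, toy_status):
        print(threshold, round(sens, 3), round(spec, 3))
```

Because each patient contributes to every threshold, the resulting estimates within a study are correlated across thresholds; this is the within-study correlation that the separate-threshold bivariate approach ignores and the multivariate models attempt to capture.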
