A diagnostic meta-analysis of the Patient Health Questionnaire-9 (PHQ-9) algorithm scoring method as a screen for depression.

BACKGROUND The depression module of the Patient Health Questionnaire-9 (PHQ-9) is a widely used depression screening instrument in nonpsychiatric settings. The PHQ-9 can be scored using different methods, including an algorithm based on Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition criteria and a cut-off based on summed-item scores. The algorithm was the originally proposed scoring method to screen for depression. We summarized the diagnostic test accuracy of the PHQ-9 using the algorithm scoring method across a range of validation studies, and compared the diagnostic properties of the algorithm scoring method with those of the summed-item scoring method at the proposed cut-off point of ≥10.

METHODS We performed a systematic review of diagnostic accuracy studies of the PHQ-9 using the algorithm scoring method to detect major depressive disorder (MDD). We used meta-analytic methods to calculate summary sensitivity, specificity, likelihood ratios and diagnostic odds ratios for the PHQ-9, scored with the algorithm method, in diagnosing MDD. In studies that reported both scoring methods (algorithm and summed-item scoring at the proposed cut-off point of ≥10), we compared the diagnostic properties of the PHQ-9 under the two methods.

RESULTS We identified 27 studies that validated the algorithm scoring method of the PHQ-9 in various settings. There was substantial heterogeneity across studies, which makes the pooled results difficult to interpret. In general, sensitivity was low whereas specificity was good. Thirteen studies reported the diagnostic properties of the PHQ-9 for both scoring methods. Pooled sensitivity was lower for the algorithm scoring method than for the summed-item scoring method, whereas specificity was good for both. Heterogeneity was consistently high; therefore, caution should be used when interpreting these results.

INTERPRETATION This review shows that, if the algorithm scoring method is used, the PHQ-9 has a low sensitivity for detecting MDD. This could be due to the rating scale categories of the measure, higher specificity or other factors that warrant further research. The summed-item scoring method at the proposed cut-off point of ≥10 has better diagnostic performance for screening purposes or where a high sensitivity is needed.
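To make the two scoring methods and the per-study accuracy measures concrete, the following is a minimal Python sketch, not the review's own analysis code. It assumes the commonly described PHQ-9 algorithm rule, which the abstract does not spell out: at least five of the nine items rated "more than half the days" (score ≥2), with item 1 or item 2 among them, and item 9 counting at any nonzero score. The function names and the 2x2 counts are hypothetical and purely illustrative. Pooling across studies, as done in the review, requires meta-analytic models that are not shown here.

```python
from typing import Dict, Sequence


def _check(items: Sequence[int]) -> None:
    """Validate a PHQ-9 response vector: nine items, each scored 0-3."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("expected nine item scores, each between 0 and 3")


def phq9_summed_positive(items: Sequence[int], cutoff: int = 10) -> bool:
    """Summed-item method: positive when the total score reaches the cut-off."""
    _check(items)
    return sum(items) >= cutoff


def phq9_algorithm_positive(items: Sequence[int]) -> bool:
    """Algorithm method (assumed rule, see lead-in).

    A symptom counts as present at a score of 2 or more; item 9 counts at
    a score of 1 or more. Positive when at least five symptoms are present
    and item 1 or item 2 is one of them.
    """
    _check(items)
    present = [score >= 2 for score in items[:8]] + [items[8] >= 1]
    return sum(present) >= 5 and (present[0] or present[1])


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Per-study accuracy from a 2x2 table; assumes no cell is zero."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "dor": (tp * tn) / (fp * fn),  # diagnostic odds ratio
    }


# Hypothetical respondent: several symptoms rated "several days" (score 1).
responses = [1, 1, 2, 2, 2, 2, 1, 0, 0]
print(phq9_summed_positive(responses), phq9_algorithm_positive(responses))

# Made-up counts for a single study, for illustration only.
print(accuracy_measures(tp=40, fp=20, fn=60, tn=880))
```

In this hypothetical response pattern the total reaches the summed-item cut-off while fewer than five items meet the per-item threshold, which illustrates one way the rating scale categories could lower the sensitivity of the algorithm method.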
