Differential Treatment Benefit Prediction For Treatment Selection in Depression: A Deep Learning Analysis of STAR*D and CO-MED Data

Background: Depression affects one in nine people, but treatment response rates remain low. Computational modelling techniques have significant potential to predict individual patient responses and thereby support more personalized treatment. Deep learning is a promising technique for differential treatment selection based on predicted remission probability.

Methods: Using STAR*D and CO-MED trial data, we performed feature selection and then trained deep neural networks to predict remission. Differential treatment benefit was estimated as the improvement in population remission rate when the model was applied for treatment selection, using both a naive and a conservative approach. The naive approach assessed the population remission rate in five sets of 200 patients held out from the training set. The conservative approach used bootstrapping for sample generation and compared the remission rate of patients who actually received the drug predicted by the model with that of the general population.

Results: Our deep learning model predicted remission in a pooled CO-MED/STAR*D dataset (including four treatments) with an AUC of 0.69 using 17 input features. The naive analysis showed a relative improvement in remission rate of over 30% (from a 34.33% population remission rate to 46.12%). The conservative analysis showed a 7.2% improvement in population remission rate (p = 0.01, C.I. 2.48% ± 0.5%).

Conclusion: Our model serves as proof-of-concept that deep learning has utility in differential prediction of antidepressant response when selecting from a number of treatment options. These models may have significant real-world clinical implications.
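The conservative comparison described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names, the array layout (one entry per patient), and the use of a percentile interval are assumptions made for illustration only. The sketch resamples patients with replacement and, in each resample, compares the remission rate of patients whose received treatment matched the model's recommendation with the remission rate of the whole resample. A small helper also reproduces the relative-improvement arithmetic behind the naive result.

```python
import numpy as np


def relative_improvement(baseline_rate, new_rate):
    """Relative change of new_rate with respect to baseline_rate."""
    return (new_rate - baseline_rate) / baseline_rate


def bootstrap_matched_benefit(remitted, received, recommended,
                              n_boot=1000, seed=0):
    """Bootstrap the gap between two remission rates (hypothetical helper).

    remitted    : 0/1 outcome per patient
    received    : treatment each patient actually received
    recommended : treatment the model would have selected
    Returns the mean gap and a 95% percentile interval.
    """
    remitted = np.asarray(remitted, dtype=float)
    matched = np.asarray(received) == np.asarray(recommended)
    rng = np.random.default_rng(seed)
    n = len(remitted)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        m = matched[idx]
        if not m.any():
            continue  # skip resamples that contain no matched patients
        sample = remitted[idx]
        diffs.append(sample[m].mean() - sample.mean())
    diffs = np.array(diffs)
    low, high = np.percentile(diffs, [2.5, 97.5])
    return diffs.mean(), (low, high)


# Naive result from the abstract: 34.33% -> 46.12% is roughly a 34% relative gain.
print(round(relative_improvement(0.3433, 0.4612), 3))
```

A positive mean gap with an interval that excludes zero would suggest that patients who happened to receive the recommended drug remitted more often than the population as a whole, which is the kind of quantity the conservative analysis reports.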
