Considering patient clinical history impacts performance of machine learning models in predicting course of multiple sclerosis

Multiple Sclerosis (MS) progresses at an unpredictable rate, yet predicting the disease course in each patient would be extremely useful for tailoring therapy to individual needs. We explore different machine learning (ML) approaches to predict whether a patient will shift from the initial Relapsing-Remitting (RR) form to the Secondary Progressive (SP) form of the disease, using only “real world” data available in clinical routine. We used the clinical records of 1624 outpatients (207 in the SP phase) attending the MS service of Sant’Andrea hospital, Rome, Italy. Predictions at 180, 360 or 720 days from the last visit were obtained in two settings. In the Visit-Oriented setting, only the data of the last available visit were considered, and four classical ML methods (Random Forest, Support Vector Machine, K-Nearest Neighbours and AdaBoost) were compared. In the History-Oriented setting, the whole clinical history of each patient was considered, using a Recurrent Neural Network model specifically designed for historical data. Missing values were handled either by removing all clinical records with at least one missing parameter (Feature-saving approach) or by removing the 3 clinical parameters that contained missing values (Record-saving approach). Classifier performance was rated using common indicators such as Recall (or Sensitivity) and Precision (or Positive Predictive Value). In the Visit-Oriented setting, the Record-saving approach yielded Recall values from 70% to 100% but low Precision (5% to 10%); Precision increased to 50% when considering only predictions for which the model returned a probability above a given “confidence threshold”. In the History-Oriented setting, both indicators increased as the prediction horizon lengthened, reaching 67% (Recall) and 42% (Precision) at 720 days. We show that “real world” data can be used effectively to forecast the evolution of MS with high Recall, and we propose innovative approaches to improve Precision towards clinically useful values.
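
To make the effect of a “confidence threshold” on Recall and Precision concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `precision_recall`, the toy labels and probabilities, and the reading of the threshold as "flag a patient as SP only if the predicted probability reaches the threshold" are all assumptions made for illustration.

```python
def precision_recall(y_true, y_prob, threshold=0.5):
    """Return (precision, recall) when a case is flagged positive
    only if its predicted probability is >= threshold.

    y_true: iterable of 0/1 labels (1 = patient converted to SP).
    y_prob: iterable of predicted probabilities for the positive class.
    Hypothetical helper, written for illustration only.
    """
    tp = fp = fn = 0
    for truth, prob in zip(y_true, y_prob):
        flagged = prob >= threshold
        if flagged and truth:
            tp += 1          # correctly flagged conversion
        elif flagged and not truth:
            fp += 1          # false alarm
        elif truth:
            fn += 1          # missed conversion
    # Guard against division by zero when nothing is flagged or no positives exist.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Toy data, invented for this example (not taken from the study).
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.95, 0.60, 0.55, 0.70, 0.20, 0.85, 0.52, 0.40]

default_pr = precision_recall(y_true, y_prob, threshold=0.5)
strict_pr = precision_recall(y_true, y_prob, threshold=0.8)
print("threshold 0.5:", default_pr)
print("threshold 0.8:", strict_pr)
```

On this toy data, raising the threshold from 0.5 to 0.8 lifts Precision from 1/3 to 1/2 while Recall falls from 2/3 to 1/3. This is the general trade-off to expect when only high-confidence predictions are retained; the actual magnitudes depend on the model and data.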
