A Systematic Review of Machine Learning Techniques in Hematopoietic Stem Cell Transplantation (HSCT)

Machine learning (ML) techniques are now widely used in healthcare for the diagnosis, prognosis, and treatment of disease. These techniques also have applications in hematopoietic cell transplantation (HCT), a potentially curative therapy for hematological malignancies. We conducted a systematic review of the application of ML techniques in the HCT setting, examining the types of data streams included, the specific ML techniques used, and the types of clinical outcomes measured. We searched the PubMed, Scopus, Web of Science, and IEEE Xplore databases for English-language articles. Search terms included “hematopoietic cell transplantation (HCT),” “autologous HCT,” “allogeneic HCT,” “machine learning,” and “artificial intelligence.” Only full-text studies published between January 2015 and July 2020 were included. Two authors extracted data using predefined data fields. Following PRISMA guidelines, 242 studies were identified, of which 27 met the inclusion criteria. These studies were sub-categorized into three broad topics. The ML techniques used included ensemble learning (63%), regression (44%), Bayesian learning (30%), and support vector machines (30%). The majority of studies examined models to predict HCT outcomes (e.g., survival, relapse, graft-versus-host disease). Clinical and genetic data were the most commonly used predictors in the modeling process. Overall, this review summarizes the ML techniques applied in the context of HCT. The evidence is not sufficiently robust to determine the optimal ML technique for the HCT setting or the minimal set of data variables required.
