Reproducible radiomics through automated machine learning validated on twelve clinical applications

Radiomics uses quantitative medical imaging features to predict clinical outcomes. While many radiomics methods have been described in the literature, these are generally designed for a single application. The aim of this study is to generalize radiomics across applications by proposing a framework that automatically constructs and optimizes the radiomics workflow per application. To this end, we formulate radiomics as a modular workflow consisting of several components: image and segmentation preprocessing, feature extraction, feature and sample preprocessing, and machine learning. For each component, a collection of common algorithms is included. To optimize the workflow per application, we employ automated machine learning using a random search and ensembling. We evaluate our method in twelve different clinical applications, resulting in the following areas under the curve (AUCs): 1) liposarcoma (0.83); 2) desmoid-type fibromatosis (0.82); 3) primary liver tumors (0.81); 4) gastrointestinal stromal tumors (0.77); 5) colorectal liver metastases (0.68); 6) melanoma metastases (0.51); 7) hepatocellular carcinoma (0.75); 8) mesenteric fibrosis (0.81); 9) prostate cancer (0.72); 10) glioma (0.70); 11) Alzheimer’s disease (0.87); and 12) head and neck cancer (0.84). In conclusion, our method fully automatically constructs and optimizes the radiomics workflow, thereby streamlining the search for radiomics biomarkers in new applications. To facilitate reproducibility and future research, we publicly release six datasets, the open-source software implementation of our framework, and the code to reproduce this study.
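The idea of optimizing a modular workflow with a random search followed by ensembling can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the search space (optional scaling, number of selected features, number of neighbors), the k-nearest-neighbor classifier, the synthetic data, and all function names are assumptions chosen only to keep the example short and dependency-free.

```python
import random
import statistics


def make_toy_data(n=120, n_features=6, seed=0):
    """Synthetic stand-in for radiomics features; only feature 0 is informative."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate labels so every contiguous split is balanced
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def sample_workflow(rng, n_features):
    """Draw one random workflow configuration (hypothetical search space)."""
    return {
        "scale": rng.choice([True, False]),          # feature preprocessing
        "n_selected": rng.randint(1, n_features),    # feature selection
        "k": rng.choice([1, 3, 5, 7]),               # classifier hyperparameter
    }


def fit_workflow(cfg, X, y):
    """Fit scaling -> feature selection -> k-NN; return a scoring function."""
    n_features = len(X[0])
    cols = list(zip(*X))
    if cfg["scale"]:
        mu = [statistics.fmean(c) for c in cols]
        sd = [statistics.pstdev(c) or 1.0 for c in cols]
    else:
        mu, sd = [0.0] * n_features, [1.0] * n_features

    def transform(row):
        return [(v - m) / s for v, m, s in zip(row, mu, sd)]

    Xt = [transform(r) for r in X]

    def class_gap(j):
        # Rank features by the absolute difference between class means.
        pos = [r[j] for r, t in zip(Xt, y) if t == 1]
        neg = [r[j] for r, t in zip(Xt, y) if t == 0]
        return abs(statistics.fmean(pos) - statistics.fmean(neg))

    keep = sorted(range(n_features), key=class_gap, reverse=True)[: cfg["n_selected"]]
    train = [([r[j] for j in keep], t) for r, t in zip(Xt, y)]

    def predict(row):
        z = transform(row)
        q = [z[j] for j in keep]
        nearest = sorted(
            train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], q))
        )[: cfg["k"]]
        return sum(t for _, t in nearest) / cfg["k"]  # fraction of positive neighbors

    return predict


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparisons (ties count as 0.5)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_search_ensemble(X_tr, y_tr, X_val, y_val, n_iter=30, top_n=5, seed=0):
    """Random search over workflows, then average the top_n by validation AUC."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_iter):
        cfg = sample_workflow(rng, len(X_tr[0]))
        model = fit_workflow(cfg, X_tr, y_tr)
        val_auc = auc([model(r) for r in X_val], y_val)
        candidates.append((val_auc, cfg, model))
    candidates.sort(key=lambda c: c[0], reverse=True)
    best = candidates[:top_n]

    def ensemble(row):
        return statistics.fmean([m(row) for _, _, m in best])

    return ensemble, best


# Train on the first 60 samples, validate on the next 30, test on the last 30.
X, y = make_toy_data()
ensemble, best = random_search_ensemble(X[:60], y[:60], X[60:90], y[60:90])
test_auc = auc([ensemble(r) for r in X[90:]], y[90:])
print(f"best validation AUC: {best[0][0]:.2f}, ensemble test AUC: {test_auc:.2f}")
```

Workflows are ranked on a validation split and the ensemble is scored on a separate test split, so the reported AUC is not biased by the selection step. A real study would more likely use repeated cross-validation and an established library rather than this hand-rolled toy.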
