Machine Learning methods for Quantitative Radiomic Biomarkers

Radiomics extracts and mines a large number of medical imaging features that quantify tumor phenotypic characteristics. Highly accurate and reliable machine-learning approaches can drive the success of radiomic applications in clinical care. In this radiomic study, fourteen feature selection methods and twelve classification methods were examined in terms of their performance and stability for predicting overall survival. A total of 440 radiomic features were extracted from pre-treatment computed tomography (CT) images of 464 lung cancer patients. To ensure an unbiased evaluation of the different machine-learning methods, publicly available implementations along with their reported parameter configurations were used. Furthermore, we used two independent radiomic cohorts for training (n = 310 patients) and validation (n = 154 patients). We found that the Wilcoxon test-based feature selection method, WLCX (stability = 0.84 ± 0.05, AUC = 0.65 ± 0.02), and the random forest classification method, RF (RSD = 3.52%, AUC = 0.66 ± 0.03), had the highest prognostic performance with high stability against data perturbation. Our variability analysis indicated that the choice of classification method is the dominant source of performance variation (34.21% of total variance). Identifying optimal machine-learning methods for radiomic applications is a crucial step towards stable and clinically relevant radiomic biomarkers, providing a non-invasive way of quantifying and monitoring tumor-phenotypic characteristics in clinical practice.
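
To make the quantities in the abstract concrete, the following is a minimal sketch, not the study's implementation. It assumes binary survival labels (0/1), ranks features by how far their univariate AUC lies from 0.5 (the AUC is a rescaled Wilcoxon rank-sum / Mann-Whitney U statistic, so this is a rank-sum-style ranking), and takes RSD to mean sample standard deviation divided by mean, times 100. The function names `auc`, `wilcoxon_rank_features`, and `rsd_percent` and the toy data are hypothetical.

```python
import statistics


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count one half. This equals the Mann-Whitney U statistic
    divided by (number of positives * number of negatives).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def wilcoxon_rank_features(X, y, k):
    """Return indices of the k features whose univariate AUC is farthest from 0.5.

    Assumption: with fixed group sizes, a larger |AUC - 0.5| corresponds to a
    more extreme rank-sum statistic, so this stands in for ranking by p-value.
    """
    n_features = len(X[0])
    strength = []
    for j in range(n_features):
        column = [row[j] for row in X]
        strength.append(abs(auc(column, y) - 0.5))
    order = sorted(range(n_features), key=lambda j: strength[j], reverse=True)
    return order[:k]


def rsd_percent(values):
    """Relative standard deviation in percent: sample stdev / mean * 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    # Toy data (hypothetical): 4 patients, 2 features, binary outcome.
    X = [[0.5, 0.1], [0.2, 0.2], [0.4, 0.8], [0.3, 0.9]]
    y = [0, 0, 1, 1]
    print("selected feature indices:", wilcoxon_rank_features(X, y, k=1))
    # AUC values from repeated resampling would go here; these are made up.
    print("RSD (%):", rsd_percent([0.64, 0.66, 0.68]))
```

In this sketch a lower RSD over resampled AUC values indicates a classifier whose performance changes less under data perturbation, which is the sense of stability reported for RF above.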
