An Empirical Approach for Avoiding False Discoveries When Applying High-Dimensional Radiomics to Small Datasets

<italic>Purpose:</italic> Radiomic studies, in which correlations are drawn between features of patients’ medical images and patient outcomes, often rely on small datasets. Consequently, results can suffer from a lack of replicability and stability. This paper establishes a methodology to assess and reduce the impact of statistical fluctuations that may occur in small datasets. Such fluctuations can lead to false discoveries, particularly when applying the feature selection or machine learning (ML) methods commonly used in the radiomics literature.
<italic>Methods:</italic> Two feature selection methods were created: one for choosing single predictive features, and another for obtaining feature sets that could be combined in a predictive model. The features were combined using ML tools less affected by overfitting (Naïve Bayes, logistic regression, and linear support vector machines). Only three features were allowed to be combined at a time, further limiting overfitting. This methodology was applied to MR images from small datasets in metastatic liver disease (69 samples) and primary uterine adenocarcinoma (93 samples). The outcomes studied were desmoplasia (for liver metastases) and lymphovascular space invasion (LVSI), cancer staging (FIGO), and tumor grade (for uterine tumors). For the uterine cancer outcomes, the predictive models were tested on independent subsets.
<italic>Results:</italic> With the combined predictive feature approach, the predictive model for LVSI, a prognostic factor that a human reader cannot detect, yielded AUC = 0.87 ± 0.07 and accuracy = 0.84 ± 0.09 in the testing set. For FIGO staging, AUC = 0.81 ± 0.03 and accuracy = 0.79 ± 0.08. For tumor grade, AUC = 0.76 ± 0.05 and accuracy = 0.70 ± 0.08.
<italic>Conclusion:</italic> Despite considering a large set (<inline-formula> <tex-math notation="LaTeX">$\sim 10^{4}$ </tex-math></inline-formula>) of texture features, the false discovery avoidance methodology allowed only robust predictive models to be retained. Thus, the stringent false discovery avoidance methods introduced here do not preclude the discovery of promising correlations.
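The risk described above can be illustrated with a minimal sketch, which is not the paper's selection procedure. It assumes a synthetic dataset of pure-noise features (69 samples and 10,000 features, sizes borrowed from the abstract) with an arbitrary 30/39 split of binary labels. It uses a label-permutation null on the maximum single-feature AUC as one generic way to judge whether the best feature found is better than chance. All function names are hypothetical.

```python
import numpy as np

def column_ranks(X):
    """1-based rank of every value within its column (assumes no ties)."""
    return X.argsort(axis=0).argsort(axis=0) + 1

def auc_per_feature(ranks, y):
    """AUC of each feature for binary labels y, via the Mann-Whitney U statistic."""
    n_pos = int(y.sum())
    n_neg = len(y) - n_pos
    u = ranks[y == 1].sum(axis=0) - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

def best_auc(ranks, y):
    """Largest AUC over all features, ignoring the direction of the association."""
    auc = auc_per_feature(ranks, y)
    return float(np.max(np.maximum(auc, 1.0 - auc)))

def max_auc_null(ranks, y, n_perm, rng):
    """Null distribution of the best AUC, obtained by shuffling the labels."""
    return np.array([best_auc(ranks, rng.permutation(y)) for _ in range(n_perm)])

rng = np.random.default_rng(0)
n_samples, n_features = 69, 10_000                 # assumed sizes, noise only
X = rng.normal(size=(n_samples, n_features))       # features carry no signal
y = (np.arange(n_samples) < 30).astype(int)        # arbitrary binary outcome

ranks = column_ranks(X)
observed = best_auc(ranks, y)
null = max_auc_null(ranks, y, n_perm=50, rng=rng)
threshold = float(np.quantile(null, 0.95))

print(f"best AUC among noise features: {observed:.2f}")
print(f"95th percentile of permutation null: {threshold:.2f}")
print("survives the permutation test:", observed > threshold)
```

Because the features are noise, the best AUC lands well above 0.5 purely through the number of features searched. Comparing it with the permutation null of the maximum, rather than with 0.5, accounts for that search.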
