Toward automatic prediction of EGFR mutation status in pulmonary adenocarcinoma with 3D deep learning

This study aimed to develop a deep learning system based on 3D convolutional neural networks (CNNs) that automatically predicts EGFR‐mutant pulmonary adenocarcinoma from CT images. A dataset of 579 nodules, each labeled with EGFR mutation status as mutant (Mut) or wild‐type (WT), was retrospectively analyzed. The system, built on 3D DenseNets, processes 3D patches of nodules from CT data and learns strong representations through supervised end‐to‐end training. The 3D DenseNets were trained on a training subset of 348 nodules and tuned on a development subset of 116 nodules. A strong data augmentation technique, mixup, was used to improve generalization. We evaluated the model on a holdout subset of 115 nodules, and an independent public dataset of 37 nodules from The Cancer Imaging Archive (TCIA) was used to further test generalization. Conventional radiomics analysis was performed for comparison. Our method achieved promising performance in predicting EGFR mutation status, with AUCs of 75.8% and 75.0% on the holdout test set and the public test set, respectively. Moreover, strong relations were found between deep learning features and conventional radiomics features; deep learning appears to work as an enhanced form of radiomics, that is, deep learned radiomics (DLR), in terms of robustness, compactness, and expressiveness. The proposed system predicts the EGFR mutation status of lung adenocarcinomas from CT images noninvasively and automatically, indicating its potential to support clinical decision‐making by identifying patients with pulmonary adenocarcinoma who are eligible for EGFR‐targeted therapy.
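The mixup augmentation named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function name `mixup_batch`, the patch shape, and the value `alpha=0.2` are assumptions made for illustration; the abstract does not state the hyperparameters used. The sketch blends each 3D patch and its one-hot label with a randomly chosen partner from the same batch, using a weight drawn from a Beta distribution.

```python
import numpy as np


def mixup_batch(patches, labels, alpha=0.2, rng=None):
    """Blend each sample with a random partner from the same batch.

    patches: float array of shape (N, D, H, W), hypothetical 3D nodule patches.
    labels:  float array of shape (N, C), one-hot labels (e.g. WT vs. Mut).
    alpha:   Beta distribution parameter (assumed value, not from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    perm = rng.permutation(len(patches))  # partner index for each sample
    mixed_x = lam * patches + (1.0 - lam) * patches[perm]
    mixed_y = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_x, mixed_y, lam


# Usage with synthetic data: four 8x8x8 patches, two classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8, 8)).astype(np.float32)
y = np.eye(2)[[0, 1, 1, 0]]
mixed_x, mixed_y, lam = mixup_batch(x, y, alpha=0.2, rng=rng)
```

Because the labels are mixed with the same weight as the inputs, each mixed label row still sums to 1, so it can be used directly as a soft target for a cross-entropy loss.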
