Prediction of pathologic stage in non-small cell lung cancer using machine learning algorithm based on CT image feature analysis

Purpose: To explore imaging biomarkers that can be used for diagnosis and prediction of pathologic stage in non-small cell lung cancer (NSCLC) using multiple machine learning algorithms based on CT image feature analysis.

Methods: Patients with stage IA to IV NSCLC were included, and the whole dataset was divided into training and testing sets and an external validation set. To address class imbalance, we generated a new dataset with a balanced class distribution using the synthetic minority oversampling technique (SMOTE). The dataset was then randomly split into training and testing sets. We calculated the importance of each CT image feature as the mean decrease in Gini impurity produced by a random forest algorithm and selected the optimal features by thresholding that importance (mean decrease in Gini impurity > 0.005). The performance of the prediction model in the training and testing sets was evaluated in terms of classification accuracy, average precision (AP) score, and the precision-recall curve. The predictive accuracy of the model was externally validated using lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) samples from the TCGA database.

Results: The prediction model, which incorporated nine image features, achieved high classification accuracy, precision, and recall in the training and testing sets. In the external validation, the model's predictive accuracy was higher for LUAD than for LUSC.

Conclusions: The pathologic stage of patients with NSCLC can be accurately predicted from CT image features, especially for LUAD. Our findings extend the application of machine learning algorithms to CT image feature prediction for pathologic staging and identify potential imaging biomarkers that can be used for diagnosis of pathologic stage in NSCLC patients.
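The Methods steps can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data from `make_classification` in place of CT image features, and reduces staging to two classes for brevity. The helper `smote_oversample`, the neighbour count `k=5`, the 70/30 split, and the forest size of 200 trees are all hypothetical choices. Only the 0.005 importance threshold and the order of the steps (oversample, split, rank features, evaluate) come from the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors


def smote_oversample(X, y, minority_label, k=5, seed=0):
    """Hypothetical minimal SMOTE: add synthetic minority samples until balanced."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    # Pick a random minority sample and one of its k neighbours for each new point.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    # Interpolate at a random position on the segment between the two points.
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    X_bal = np.vstack([X, synthetic])
    y_bal = np.concatenate([y, np.full(n_new, minority_label)])
    return X_bal, y_bal


# Synthetic, imbalanced stand-in for a CT feature matrix (assumption, not study data).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)

# 1. Balance the classes, then split, following the order given in the Methods.
X_bal, y_bal = smote_oversample(X, y, minority_label=1)
X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.3, stratify=y_bal, random_state=0)

# 2. Rank features by mean decrease in Gini impurity and keep those above 0.005.
ranker = RandomForestClassifier(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
selected = np.flatnonzero(ranker.feature_importances_ > 0.005)

# 3. Refit on the selected features and evaluate accuracy and average precision.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train[:, selected], y_train)
accuracy = accuracy_score(y_test, model.predict(X_test[:, selected]))
ap = average_precision_score(y_test, model.predict_proba(X_test[:, selected])[:, 1])
print(f"features kept: {len(selected)}, accuracy: {accuracy:.3f}, AP: {ap:.3f}")
```

In scikit-learn, `feature_importances_` on a random forest fitted with the default Gini criterion is the impurity-based importance, normalized to sum to 1, so the 0.005 threshold here applies to that normalized scale. Whether the study used the same normalization is not stated in the abstract.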
