Classifying non-small cell lung cancer types and transcriptomic subtypes using convolutional neural networks

OBJECTIVE Non-small cell lung cancer is a leading cause of cancer death worldwide, and histopathological evaluation plays the primary role in its diagnosis. However, the morphological patterns associated with the molecular subtypes have not been systematically studied. To bridge this gap, we developed a quantitative histopathology analytic framework to objectively identify the types and gene expression subtypes of non-small cell lung cancer.

MATERIALS AND METHODS We processed whole-slide histopathology images of patients with lung adenocarcinoma (n = 427) and lung squamous cell carcinoma (n = 457) in The Cancer Genome Atlas. We built convolutional neural networks to classify histopathology images, evaluated their performance by the areas under the receiver-operating characteristic curves (AUCs), and validated the results in an independent cohort (n = 125).

RESULTS To establish neural networks for quantitative image analyses, we first built convolutional neural network models that distinguished tumor regions from adjacent dense benign tissues (AUCs > 0.935) and recapitulated expert pathologists' diagnoses (AUCs > 0.877), with the results validated in an independent cohort (AUCs = 0.726-0.864). We further demonstrated that quantitative histopathology morphology features identified the major transcriptomic subtypes of both adenocarcinoma and squamous cell carcinoma (P < .01).

DISCUSSION Our study is the first to classify the transcriptomic subtypes of non-small cell lung cancer using fully automated machine learning methods. Our approach does not rely on prior pathology knowledge and can discover novel clinically relevant histopathology patterns objectively. The developed procedure is generalizable to other tumor types or diseases.
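
The AUC used as the evaluation metric above can be read as the probability that a randomly chosen positive example (for instance, a tumor image tile) receives a higher classifier score than a randomly chosen negative one. The following is a minimal sketch of that definition, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive example has the higher score.
    Tied pairs count as half a correct ranking."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical scores (not data from the study): 1 = tumor, 0 = benign.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

# 8 of the 9 positive/negative pairs are ranked correctly.
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; for large tile sets a rank-based computation or an established library routine would be the more practical choice.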
