Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning

Visual inspection of histopathology slides is one of the main methods used by pathologists to assess the stage, type and subtype of lung tumors. Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most prevalent subtypes of lung cancer, and their distinction requires visual inspection by an experienced pathologist. In this study, we trained a deep convolutional neural network (Inception v3) on whole-slide images obtained from The Cancer Genome Atlas to accurately and automatically classify them into LUAD, LUSC or normal lung tissue. The performance of our method is comparable to that of pathologists, with an average area under the curve (AUC) of 0.97. Our model was validated on independent datasets of frozen tissues, formalin-fixed paraffin-embedded tissues and biopsies. Furthermore, we trained the network to predict the ten most commonly mutated genes in LUAD. We found that six of them (STK11, EGFR, FAT1, SETBP1, KRAS and TP53) can be predicted from pathology images, with AUCs from 0.733 to 0.856 as measured on a held-out population. These findings suggest that deep-learning models can assist pathologists in the detection of cancer subtype or gene mutations. Our approach can be applied to any cancer type, and the code is available at https://github.com/ncoudray/DeepPATH.

A convolutional neural network model using feature extraction and machine-learning techniques provides a tool for classifying lung cancer histopathology images and predicting the mutational status of driver oncogenes.
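To make the reported metric concrete, the following is a minimal sketch of how an average AUC over the three classes (normal, LUAD, LUSC) could be computed from predicted class probabilities. It assumes a macro-averaged one-vs-rest AUC, which is one common reading of "average AUC"; the abstract does not state the exact averaging used, and the function names and toy data below are hypothetical, not taken from the DeepPATH code.

```python
from typing import List, Sequence

# Class order is an assumption made for this illustration only.
CLASSES = ("normal", "LUAD", "LUSC")


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive example scores higher
    than a random negative example (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(true_classes: Sequence[str],
                  class_probs: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of the one-vs-rest AUCs, one per class."""
    aucs: List[float] = []
    for k, name in enumerate(CLASSES):
        labels = [1 if t == name else 0 for t in true_classes]
        scores = [p[k] for p in class_probs]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Hypothetical per-slide probabilities, ordered as in CLASSES.
    truth = ["normal", "LUAD", "LUSC", "LUAD"]
    probs = [
        [0.8, 0.1, 0.1],
        [0.1, 0.7, 0.2],
        [0.2, 0.3, 0.5],
        [0.3, 0.4, 0.3],
    ]
    print(f"macro one-vs-rest AUC: {macro_ovr_auc(truth, probs):.3f}")
```

The pairwise formulation is quadratic in the number of examples, which is acceptable for illustration; a rank-based implementation would be preferable on large held-out sets.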
