Weakly-supervised learning for lung carcinoma classification using deep learning

Lung cancer is one of the major causes of cancer-related deaths in many countries around the world, and its histopathological diagnosis is crucial for deciding on optimum treatment strategies. Recently, Artificial Intelligence (AI) deep learning models have been widely shown to be useful in various medical fields, particularly in image-based and pathological diagnosis; however, AI models for the pathological diagnosis of pulmonary lesions that have been validated on large-scale test sets are yet to be seen. We trained a Convolutional Neural Network (CNN) based on the EfficientNet-B3 architecture, using transfer learning and weakly-supervised learning, to predict carcinoma in Whole Slide Images (WSIs) using a training dataset of 3,554 WSIs. The model differentiated between lung carcinoma and non-neoplastic lesions with high receiver operating characteristic (ROC) areas under the curve (AUCs) on four independent test sets (ROC AUCs of 0.975, 0.974, 0.988, and 0.981, respectively). The development and validation of algorithms such as ours are important initial steps towards software suites that could be adopted in routine pathological practice and could help reduce the burden on pathologists.
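To make the evaluation setting concrete, the following is a minimal sketch of how a slide-level ROC AUC could be computed when only slide-level labels are available. It assumes that the model emits one carcinoma probability per tile and that a WSI is scored by the maximum tile probability; this max-pooling rule is a common choice in multiple instance learning, not necessarily the aggregation used in this work. The function names and the toy data are hypothetical.

```python
from typing import Sequence


def slide_score(tile_probs: Sequence[float]) -> float:
    """Aggregate tile-level carcinoma probabilities into one slide-level score.

    Max-pooling is an assumed aggregation rule, not taken from the paper.
    """
    if not tile_probs:
        raise ValueError("a slide needs at least one tile")
    return max(tile_probs)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive slide outranks a negative one.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative slides")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: tile probabilities per slide and slide-level labels
# (1 = carcinoma, 0 = non-neoplastic).
tiles = {
    "slide_a": [0.02, 0.10, 0.93],
    "slide_b": [0.01, 0.04, 0.08],
    "slide_c": [0.55, 0.61],
    "slide_d": [0.03, 0.70],
}
labels = {"slide_a": 1, "slide_b": 0, "slide_c": 1, "slide_d": 0}

names = sorted(tiles)
scores = [slide_score(tiles[n]) for n in names]
auc = roc_auc([labels[n] for n in names], scores)
print(f"slide-level ROC AUC: {auc:.3f}")
```

The pairwise formulation is quadratic in the number of slides, which is acceptable for a sketch; on test sets of realistic size a rank-based routine from an established library would be the more practical choice.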
