Efficient pan-cancer whole-slide image classification and outlier detection using convolutional neural networks

Visual analysis of solid tissue mounted on glass slides is currently the primary method used by pathologists to determine the stage, type and subtype of cancer. Although whole-slide images are usually large (tens to hundreds of thousands of pixels wide), an exhaustive, though time-consuming, assessment is necessary to reduce the risk of misdiagnosis. In an effort to address the many diagnostic challenges faced by trained experts, recent research has focused on developing automatic prediction systems for this multi-class classification problem. Typically, complex convolutional neural network (CNN) architectures, such as Google’s Inception, are used to tackle this problem. Here, we introduce a greatly simplified CNN architecture, PathCNN, which allows for more efficient use of computational resources and better classification performance. Using this improved architecture, we trained simultaneously on whole-slide images from multiple tumor sites and corresponding non-neoplastic tissue. Dimensionality reduction analysis of the weights of the last layer of the network captures groups of images that faithfully represent the different types of cancer, while also highlighting differences in staining and capturing outliers, artifacts and misclassification errors. Our code is available online at: https://github.com/sedab/PathCNN.
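The dimensionality-reduction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the PathCNN implementation: it assumes that a feature vector per image has already been extracted from the network's last layer, substitutes plain PCA for whichever reduction method the authors used, and runs on synthetic data. The function names `pca_project` and `flag_outliers`, the threshold `z_thresh`, and the robust z-score rule are all hypothetical choices made for illustration.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def flag_outliers(embedding, labels, z_thresh=3.0):
    """Flag points that lie unusually far from their own class centroid."""
    flags = np.zeros(len(labels), dtype=bool)
    for cls in np.unique(labels):
        idx = np.where(labels == cls)[0]
        centroid = embedding[idx].mean(axis=0)
        dist = np.linalg.norm(embedding[idx] - centroid, axis=1)
        # Robust z-score: median and MAD are less affected by the outliers
        # themselves than mean and standard deviation would be.
        med = np.median(dist)
        mad = np.median(np.abs(dist - med)) + 1e-12
        flags[idx] = (dist - med) / (1.4826 * mad) > z_thresh
    return flags


# Synthetic stand-in for last-layer features: three well-separated "cancer
# types", 50 images each, 16 features per image (all values are made up).
rng = np.random.default_rng(0)
n_per_class, dim = 50, 16
centers = rng.normal(scale=10.0, size=(3, dim))
features = np.vstack(
    [c + rng.normal(size=(n_per_class, dim)) for c in centers]
)
labels = np.repeat(np.arange(3), n_per_class)

# Simulate one mislabeled image: it carries label 0 but sits in cluster 1.
features[0] = centers[1] + rng.normal(size=dim)

embedding = pca_project(features, n_components=2)
flags = flag_outliers(embedding, labels)
print("indices flagged as outliers:", np.where(flags)[0])
```

In this toy setting the mislabeled image lands far from the other images that share its label, so it is flagged. On real slides, a flagged point could equally reflect a staining difference or an artifact, so flags of this kind are a prompt for manual review rather than a verdict.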
