PanNuke Dataset Extension, Insights and Baselines

The emerging area of computational pathology (CPath) is ripe ground for the application of deep learning (DL) methods to healthcare, owing to the sheer volume of raw pixel data in whole-slide images (WSIs) of cancerous tissue slides. However, DL algorithms that rely on nuclei-level details must be able to cope with data from 'the clinical wild', which tends to be quite challenging. We study and extend the recently released PanNuke dataset, which consists of ~200,000 nuclei categorized into 5 clinically important classes, for the challenging tasks of segmenting and classifying nuclei in WSIs. Previous pan-cancer datasets consisted of only up to 9 different tissues, with up to 21,000 unlabeled nuclei and just over 24,000 labeled nuclei with segmentation masks. PanNuke consists of 19 different tissue types that have been semi-automatically annotated and quality controlled by clinical pathologists, leading to a dataset with statistics similar to the clinical wild and with minimal selection bias. We study the performance of segmentation and classification models when applied to the proposed dataset and demonstrate the application of models trained on PanNuke to whole-slide images. We provide comprehensive statistics about the dataset and outline recommendations and research directions to address the limitations of existing DL tools when applied to real-world CPath applications.
