Lizard: A Large-Scale Dataset for Colonic Nuclear Instance Segmentation and Classification

The development of deep segmentation models for computational pathology (CPath) can help foster the investigation of interpretable morphological biomarkers. Yet there is a major bottleneck in the success of such approaches, because supervised deep learning models require an abundance of accurately labelled data. This issue is exacerbated in CPath because generating detailed annotations usually requires a pathologist to distinguish between different tissue constructs and nuclei. Manually labelling nuclei may not be feasible for collecting large-scale annotated datasets, especially when a single image region can contain thousands of different cells. However, relying solely on automatic generation of annotations will limit the accuracy and reliability of the ground truth. Therefore, to help overcome these challenges, we propose a multi-stage annotation pipeline with pathologist-in-the-loop refinement steps to enable the collection of large-scale datasets for histology image analysis. Using this pipeline, we generate the largest known nuclear instance segmentation and classification dataset, containing nearly half a million labelled nuclei in H&E-stained colon tissue. We have released the dataset and encourage the research community to utilise it to drive forward the development of downstream cell-based models in CPath.
