A Petri Dish for Histopathology Image Analysis

With the rise of deep learning, there has been increased interest in using neural networks for histopathology image analysis, a field that investigates the properties of biopsy or resected specimens that pathologists traditionally examine manually under a microscope. In histopathology image analysis, however, challenges such as limited data, costly annotation, and the need to process high-resolution, variable-size images create a high barrier to entry and make it difficult to iterate quickly over model designs. Throughout scientific history, many significant research directions have leveraged small-scale experimental setups as petri dishes to efficiently evaluate exploratory ideas, which are then validated in large-scale applications. For instance, the Drosophila fruit fly in genetics and MNIST in computer vision are well-known petri dishes. In this paper, we introduce a minimalist histopathology image analysis dataset (MHIST), an analogous petri dish for histopathology image analysis. MHIST is a binary classification dataset of 3,152 fixed-size images of colorectal polyps, each with a gold-standard label determined by the majority vote of seven board-certified gastrointestinal pathologists. MHIST also includes each image’s annotator agreement level. As a minimalist dataset, MHIST occupies less than 400 MB of disk space, and a ResNet-18 baseline can be trained to convergence on MHIST in just 6 minutes using approximately 3.5 GB of memory on an NVIDIA RTX 3090. As example use cases, we use MHIST to study natural questions that arise in histopathology image classification, such as how dataset size, network depth, transfer learning, and high-disagreement examples affect model performance. By introducing MHIST, we hope not only to facilitate the work of current histopathology imaging researchers, but also to make histopathology image analysis more accessible to the general computer vision community. Our dataset is available at https://bmirds.github.io/MHIST/.
Figure 1: Key features of our minimalist histopathology image analysis dataset (MHIST).
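
As a rough illustration of how a majority-vote label and an agreement level could be derived from seven binary annotations, consider the minimal sketch below. The function name `majority_label`, the 0/1 class encoding, and the definition of agreement as the number of annotators who voted for the winning class are assumptions made for illustration only; they are not taken from the dataset's released files or code.

```python
from collections import Counter


def majority_label(votes):
    """Return (label, agreement) for an odd number of binary votes.

    `agreement` is the number of annotators who chose the winning label.
    This definition is an illustrative assumption, not the dataset's format.
    """
    if len(votes) % 2 == 0:
        raise ValueError("expected an odd number of votes so that no tie can occur")
    counts = Counter(votes)
    if len(counts) > 2:
        raise ValueError("expected at most two distinct class labels")
    # most_common(1) returns the single (label, count) pair with the highest count.
    label, agreement = counts.most_common(1)[0]
    return label, agreement


# Hypothetical votes from seven annotators for one image (0 and 1 are placeholder classes).
example_votes = [1, 1, 0, 1, 0, 1, 1]
label, agreement = majority_label(example_votes)
print(f"label={label}, agreement={agreement}/{len(example_votes)}")
```

With an odd number of annotators and two classes, a tie is impossible, so the winning label is always well defined; the agreement count then ranges from 4 (high disagreement) to 7 (unanimous).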
