Combining weakly and strongly supervised learning improves strong supervision in Gleason pattern classification

Background: Training deep convolutional neural network (CNN) models on whole slide images (WSIs) is challenging because it requires a large number of costly, manually annotated image regions. Strategies to alleviate the scarcity of annotated data include transfer learning, data augmentation, and training the models with less expensive image-level annotations (weakly supervised learning). However, it is not clear how to apply transfer learning in a CNN model when different data sources are available for training, or how to leverage the combination of large amounts of weakly annotated images with a set of local region annotations. This paper evaluates CNN training strategies based on transfer learning that leverage the combination of weak and strong annotations in heterogeneous data sources. The trade-off between classification performance and annotation effort is explored by evaluating a CNN that learns from strong labels (region annotations) and is later fine-tuned on a dataset with less expensive weak (image-level) labels.

Results: As expected, the model performance on strongly annotated data steadily increases with the percentage of strong annotations used, reaching a performance comparable to pathologists ($\kappa = 0.691 \pm 0.02$). Nevertheless, the performance decreases sharply when the model is applied to the WSI classification scenario ($\kappa = 0.307 \pm 0.133$), and it remains low in that scenario regardless of the number of annotations used. The performance increases when the model is fine-tuned for the task of Gleason scoring with the weak WSI labels ($\kappa = 0.528 \pm 0.05$).

Conclusion: Combining weak and strong supervision improves strong supervision in the classification of Gleason patterns using tissue microarrays (TMAs) and WSI regions.
Our results provide effective strategies for training CNN models that combine a small amount of annotated data with heterogeneous data sources. In the controlled TMA scenario, the performance increases with the number of annotations used to train the model. Nevertheless, the performance is hindered when the trained TMA model is applied directly to the more challenging WSI classification problem. This indicates that a good pre-trained model for prostate cancer TMA image classification may lead to the best downstream model if it is fine-tuned on the WSI target dataset. The source code for reproducing the experiments in the paper is available at: https://github.com/ilmaro8/Digital_Pathology_Transfer_Learning
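
The agreement values above are reported as Cohen's kappa. As an illustration of how such a statistic is computed, the following is a minimal sketch in plain Python. It is not taken from the paper's repository: the function name `cohen_kappa`, its parameters, and the toy labels are hypothetical. The abstract also does not state which weighting scheme was used, so the sketch supports both the unweighted form and the quadratic weighting often used for ordinal grades.

```python
def cohen_kappa(rater_a, rater_b, num_classes, weights=None):
    """Cohen's kappa for two label sequences with classes 0..num_classes-1.

    weights=None gives the unweighted statistic; weights="quadratic"
    penalises a disagreement by the squared distance between the two
    (ordinal) classes. Both options are illustrative assumptions.
    """
    if num_classes < 2:
        raise ValueError("at least two classes are required")
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(rater_a)

    # Observed joint distribution of (label_a, label_b) pairs.
    observed = [[0.0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n

    # Marginal label distributions of each rater.
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(num_classes))
              for j in range(num_classes)]

    def penalty(i, j):
        # Cost of rater A saying class i while rater B says class j.
        if weights == "quadratic":
            return ((i - j) ** 2) / ((num_classes - 1) ** 2)
        return 0.0 if i == j else 1.0

    classes = range(num_classes)
    observed_cost = sum(penalty(i, j) * observed[i][j]
                        for i in classes for j in classes)
    # Cost expected if the two raters labelled independently.
    expected_cost = sum(penalty(i, j) * marg_a[i] * marg_b[j]
                        for i in classes for j in classes)

    if expected_cost == 0.0:
        # Both raters used one and the same class throughout.
        return 1.0
    return 1.0 - observed_cost / expected_cost


if __name__ == "__main__":
    # Toy labels (hypothetical, not data from the paper).
    reference = [0, 1, 2, 2]
    predicted = [0, 1, 2, 1]
    print(cohen_kappa(reference, predicted, num_classes=3))
    print(cohen_kappa(reference, predicted, num_classes=3, weights="quadratic"))
```

With quadratic weights, a prediction that is off by one grade is penalised less than one that is off by several, which is why that variant tends to be preferred when the classes are ordered.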
