H and E stain augmentation improves generalization of convolutional networks for histopathological mitosis detection

The number of mitotic figures per tumor area observed in hematoxylin and eosin (H and E) histological tissue sections under light microscopy is an important biomarker for breast cancer prognosis. Whole-slide imaging and computational pathology have enabled the development of automatic mitosis detection algorithms based on convolutional neural networks (CNNs). These models can suffer from high generalization error, i.e., trained networks often underperform on datasets originating from pathology laboratories other than the one that provided the training data, mainly because of inter-laboratory stain variations. We propose a novel data augmentation strategy that exploits the properties of the H and E color space to simulate a broad range of realistic H and E stain variations. To the best of our knowledge, this is the first time that data augmentation has been performed directly in the H and E color space instead of RGB. The proposed technique uses color deconvolution to transform RGB images into the H and E color space, modifies the H and E color channels stochastically, and projects them back to RGB space. We trained a CNN-based mitosis detector on homogeneous data from a single institution and tested its performance on an external, multicenter cohort that contained a wide range of unseen H and E stain variations. We compared CNNs trained with and without the proposed augmentation strategy and observed a significant improvement in performance and robustness to unseen stain variations when the new color augmentation technique was included. In essence, we have shown that CNNs can be made robust to inter-laboratory stain variation by incorporating extensive stain augmentation techniques.
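The three steps of the technique (color deconvolution, stochastic modification of the stain channels, projection back to RGB) can be illustrated with a minimal NumPy sketch. Several details here are assumptions rather than facts stated in the abstract: the stain matrix uses the commonly cited Ruifrok and Johnston vectors, the perturbation is a per-channel random scale and shift, and the function names and the value of `sigma` are hypothetical choices made for illustration.

```python
import numpy as np

# Assumed stain matrix: rows are the commonly cited Ruifrok-Johnston
# optical-density vectors for hematoxylin, eosin, and a third (DAB) channel.
RGB_FROM_HED = np.array([[0.65, 0.70, 0.29],
                         [0.07, 0.99, 0.11],
                         [0.27, 0.57, 0.78]])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def rgb_to_hed(rgb):
    """Color deconvolution: float RGB in [0, 1], shape (..., 3) -> stain channels."""
    # Beer-Lambert: optical density is the negative log of transmitted intensity.
    optical_density = -np.log(np.maximum(rgb, 1e-6))
    return optical_density @ HED_FROM_RGB


def hed_to_rgb(hed):
    """Project stain channels back to float RGB in [0, 1]."""
    optical_density = hed @ RGB_FROM_HED
    return np.clip(np.exp(-optical_density), 0.0, 1.0)


def augment_hed(rgb, rng, sigma=0.05):
    """Hypothetical stochastic perturbation: random scale and shift per stain channel."""
    hed = rgb_to_hed(rgb)
    alpha = rng.uniform(1.0 - sigma, 1.0 + sigma, size=3)  # multiplicative factor
    beta = rng.uniform(-sigma, sigma, size=3)              # additive offset
    return hed_to_rgb(hed * alpha + beta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.uniform(0.05, 0.9, size=(64, 64, 3))  # stand-in for an RGB patch
    augmented = augment_hed(patch, rng)
    print(augmented.shape, float(augmented.min()), float(augmented.max()))
```

In a training pipeline, a function like `augment_hed` would be applied independently to each patch so that the network sees a different simulated stain appearance every time; a larger `sigma` simulates stronger stain variation.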
