DSNet: A Dual-Stream Framework for Weakly-Supervised Gigapixel Pathology Image Analysis

We present a novel weakly-supervised framework for classifying whole slide images (WSIs). Because of their gigapixel resolution, WSIs are commonly processed by patch-wise classification with patch-level labels. However, patch-level labels require precise annotations, which are expensive and usually unavailable for clinical data. With image-level labels only, patch-wise classification is sub-optimal because the appearance of an individual patch can be inconsistent with the image-level label. To address this issue, we posit that WSI analysis can be conducted effectively by integrating information at both high magnification (local) and low magnification (regional) levels. We auto-encode the visual signals in each patch into a latent embedding vector representing local information, and down-sample the raw WSI to hardware-acceptable thumbnails representing regional information. The WSI label is then predicted with a Dual-Stream Network (DSNet), which takes the transformed local patch embeddings and multi-scale thumbnail images as inputs and can be trained with image-level labels only. Experiments conducted on two large-scale public datasets demonstrate that our method outperforms all recent state-of-the-art weakly-supervised WSI classification methods.
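The two inputs described above can be illustrated with a minimal sketch of the data preparation only. It is not the authors' implementation: the toy grayscale "slide", the function names, and in particular `toy_encode` (a placeholder that returns a patch's mean and maximum in place of a trained auto-encoder) are assumptions made for illustration, and the network that consumes these inputs is not modelled.

```python
from typing import List

Image = List[List[float]]  # toy grayscale "slide"; real WSIs are gigapixel RGB


def tile(slide: Image, size: int) -> List[List[Image]]:
    """Split the slide into a row-major grid of non-overlapping size x size tiles."""
    rows, cols = len(slide) // size, len(slide[0]) // size
    return [
        [
            [row[c * size:(c + 1) * size] for row in slide[r * size:(r + 1) * size]]
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def toy_encode(patch: Image) -> List[float]:
    """Hypothetical stand-in for a trained encoder: maps a patch to [mean, max]."""
    flat = [v for row in patch for v in row]
    return [sum(flat) / len(flat), max(flat)]


def embedding_grid(slide: Image, patch_size: int) -> List[List[List[float]]]:
    """Local stream input: one embedding vector per patch, kept in grid layout."""
    return [[toy_encode(p) for p in row] for row in tile(slide, patch_size)]


def thumbnail(slide: Image, factor: int) -> Image:
    """Regional stream input: down-sample by averaging factor x factor blocks."""
    return [
        [sum(v for row in block for v in row) / (factor * factor) for block in grid_row]
        for grid_row in tile(slide, factor)
    ]


# An 8x8 toy slide whose pixel value is its row-major index.
slide = [[float(r * 8 + c) for c in range(8)] for r in range(8)]

local_input = embedding_grid(slide, patch_size=4)        # 2x2 grid of 2-d embeddings
regional_inputs = [thumbnail(slide, f) for f in (2, 4)]  # thumbnails at two scales
```

In this sketch the local input keeps the spatial layout of the patches, so a downstream model could treat it as a small multi-channel image alongside the thumbnails; whether the original method arranges embeddings this way is an assumption here.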
