BAYES-MIL: A NEW PROBABILISTIC PERSPECTIVE

Multiple instance learning (MIL) is a popular weakly supervised learning paradigm applied to whole slide images (WSIs) for AI-assisted pathology diagnosis. Recent advances in attention-based MIL allow the model to identify regions of interest (ROIs) for interpretation by learning attention weights for the image patches of a WSI. However, we empirically find that the interpretability of some related methods is either untrustworthy, because the principle of MIL is violated, or unsatisfactory, because the high-attention regions are not consistent with experts’ annotations. In this paper, we propose Bayes-MIL to address the problem from a probabilistic perspective. The induced patch-level uncertainty is proposed as a new measure of MIL interpretability, which outperforms previous methods in matching doctors’ annotations. We design a slide-dependent patch regularizer (SDPR) for the attention, which imposes constraints derived from the MIL assumption on the attention distribution. SDPR explicitly constrains the model to generate correct attention values. Spatial information is further encoded by an approximate convolutional conditional random field (CRF) for better interpretability. Experimental results show that Bayes-MIL outperforms related methods on patch-level and slide-level metrics and provides much more interpretable ROIs on several large-scale WSI datasets.
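
For readers unfamiliar with the attention-based MIL pooling that this abstract builds on, the following is a minimal sketch of the generic deterministic formulation, in which each patch receives a softmax-normalized attention weight and the slide representation is the attention-weighted sum of patch features. It is not the Bayes-MIL model itself: it contains no uncertainty estimate, no SDPR, and no CRF. The function name `attention_mil_pool`, the parameter names `V` and `w`, and the random toy features are all assumptions made for illustration.

```python
import numpy as np


def attention_mil_pool(H, V, w):
    """Generic attention pooling over one bag (slide) of patch features.

    H: (n_patches, d) patch feature matrix (hypothetical input).
    V: (d, k) projection matrix, w: (k,) scoring vector (hypothetical parameters).
    Returns the attention weights (n_patches,) and the pooled bag feature (d,).
    """
    scores = np.tanh(H @ V) @ w      # one unnormalized score per patch
    scores = scores - scores.max()   # shift for a numerically stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum()         # weights are non-negative and sum to 1
    bag = attn @ H                   # attention-weighted sum of patch features
    return attn, bag


# Toy usage with random features: 6 patches, 4-dimensional features.
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))
V = rng.normal(size=(4, 3))
w = rng.normal(size=3)
attn, bag = attention_mil_pool(H, V, w)
```

In this deterministic form the attention weights are the only patch-level signal available for interpretation, which is the setting whose reliability the abstract questions.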
