HAL: Hybrid active learning for efficient labeling in medical domain

Abstract The success of deep convolutional neural networks in computer vision tasks relies mainly on massive labeled training data. In medical imaging, however, large labeled datasets are difficult to construct because labeling medical images is time-consuming, labor-intensive, and requires medical expertise. To meet this challenge, we propose HAL, a hybrid active learning framework for efficient labeling in the medical domain, which integrates active learning into deep learning to reduce the cost of manual labeling while taking advantage of deep neural networks. HAL uses a hybrid sampling strategy that considers sample diversity and prediction loss simultaneously. The effectiveness and efficiency of HAL are validated on three medical image datasets. The experimental results show that HAL outperforms several state-of-the-art active learning methods. On the Hyper-Kvasir dataset, with only 10% of the labels, HAL achieves 95% of the performance of the deep learning method trained on the entire dataset. The quantitative and qualitative analyses show that HAL can greatly reduce the number of labels needed to train a deep neural network and that it remains robust for efficient labeling even when the data distribution is imbalanced.
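
The abstract does not spell out how diversity and prediction loss are combined, so the following is only a minimal sketch of one plausible hybrid sampling rule, not the authors' implementation. It assumes each unlabeled sample has a feature vector and a predicted loss, and it greedily picks the sample with the best weighted sum of scaled predicted loss and scaled distance to the nearest already-selected sample. The function name `hybrid_select`, the weight `alpha`, and the greedy scheme are all hypothetical.

```python
import math


def hybrid_select(features, predicted_losses, k, alpha=0.5):
    """Greedily pick k sample indices to send for labeling.

    Illustrative assumption, not the paper's algorithm:
    score = alpha * scaled predicted loss
          + (1 - alpha) * scaled distance to the nearest selected sample.
    """
    n = len(features)
    if len(predicted_losses) != n:
        raise ValueError("features and predicted_losses must have equal length")
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and the number of samples")
    if k == 0:
        return []

    # Scale predicted losses to [0, 1] so both terms are comparable.
    lo, hi = min(predicted_losses), max(predicted_losses)
    span = (hi - lo) or 1.0
    loss_score = [(loss - lo) / span for loss in predicted_losses]

    selected = []
    # Distance from every sample to its nearest already-selected sample.
    min_dist = [math.inf] * n
    for _ in range(k):
        remaining = [i for i in range(n) if i not in selected]
        if selected:
            max_d = max(min_dist[i] for i in remaining) or 1.0
            diversity = {i: min_dist[i] / max_d for i in remaining}
        else:
            # Nothing selected yet: the first pick is driven by loss alone.
            diversity = {i: 0.0 for i in remaining}
        best = max(
            remaining,
            key=lambda i: alpha * loss_score[i] + (1 - alpha) * diversity[i],
        )
        selected.append(best)
        # Update nearest-selected distances with the newly chosen sample.
        for i in range(n):
            min_dist[i] = min(min_dist[i], math.dist(features[i], features[best]))
    return selected


# Toy data: two tight clusters; the first cluster has the higher predicted losses.
toy_features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
toy_losses = [0.9, 0.8, 0.2, 0.1]
print(hybrid_select(toy_features, toy_losses, k=2, alpha=0.5))  # [0, 2]
```

On this toy data, setting `alpha=1.0` reduces the rule to pure loss ranking and selects both samples from the first cluster, whereas the mixed weighting spreads the labeling budget across both clusters.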
