Active convolutional neural networks for cancerous tissue recognition

Deep neural networks typically require large amounts of annotated data to be trained effectively. However, in several scientific disciplines, including medical image analysis, generating such large annotated datasets requires specialized domain knowledge and is therefore usually very expensive. In this work, we present a novel application of active learning to data sample selection for training Convolutional Neural Networks (CNNs) for Cancerous Tissue Recognition (CTR). Our main idea is to steer annotation efforts towards the samples that are most informative for training the CNN. To quantify informativeness, we explore three measures based on discrete entropy, best-vs-second-best, and k-nearest neighbor agreement. Our results on datasets covering three different types of cancer consistently demonstrate that, when annotated samples are limited, our proposed training scheme converges faster than classical randomized stochastic gradient descent, while achieving the same (or sometimes superior) classification accuracy.
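The three informativeness measures named above can be illustrated with a minimal Python sketch. This is not the implementation used in this work: the function names, the use of Euclidean distance, and the exact form of the k-nearest neighbor agreement score (fraction of the k nearest labeled neighbors that share the majority label) are assumptions made for illustration only. The inputs are assumed to be class-probability vectors such as CNN softmax outputs.

```python
import math
from collections import Counter


def entropy_score(probs):
    """Discrete (Shannon) entropy of a class-probability vector.
    Higher values mean the classifier is more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bvsb_margin(probs):
    """Best-vs-second-best margin: gap between the two largest
    class probabilities. Smaller values mean more uncertainty."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def knn_agreement(query, labeled_feats, labels, k=3):
    """Hypothetical agreement score (an assumption, not the paper's
    definition): fraction of the k nearest labeled neighbors that
    carry the majority label. Lower values mean a more ambiguous sample."""
    neighbors = sorted(
        zip(labeled_feats, labels),
        key=lambda pair: math.dist(query, pair[0]),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][1] / k


def select_most_informative(pool_probs, budget):
    """Return indices of the `budget` unlabeled samples with the
    highest entropy, i.e. the ones to send for annotation first."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy_score(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]
```

In an active learning loop, a selection step like `select_most_informative` would be run on the unlabeled pool after each training round, the chosen samples would be annotated by an expert, and the CNN would then be retrained or fine-tuned on the enlarged labeled set. The entropy ranking can be swapped for the margin or agreement score by sorting in ascending order instead.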
