Active deep learning: Improved training efficiency of convolutional neural networks for tissue classification in oral cavity cancer

Deep learning has yielded impressive performance on a variety of difficult machine learning tasks, in large part because of large, widely available annotated datasets. Unfortunately, such datasets are difficult to acquire in medical imaging. In computational pathology in particular, labels are tedious to create and require expert pathologists. In this work, we explore methods for efficiently training convolutional neural networks (CNNs) for tissue classification using Active Learning (AL) instead of the more common Random Learning (RL). Our dataset consists of 143 digitized images of hematoxylin and eosin-stained whole oral cavity cancer sections. We compare AL and RL training on the task of using a CNN to identify seven tissue classes (stroma, lymphocytes, tumor, mucosa, keratin pearls, blood, and background/adipose). We find that the AL strategy provides, on average, 3.26% greater performance than RL for a given training set size.
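
To make the AL/RL contrast concrete, the following is a minimal sketch of how the next batch of samples to annotate might be chosen under each strategy. The abstract does not state which acquisition function is used, so entropy-based uncertainty sampling is an assumption here, and the function names (`predictive_entropy`, `select_active`, `select_random`) are hypothetical. The input `probs` stands in for the softmax outputs of a CNN over the seven tissue classes on the still-unlabeled pool.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy of each row of class probabilities (higher = less certain)."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)


def select_active(probs, k):
    """AL (assumed criterion): indices of the k highest-entropy unlabeled samples."""
    scores = predictive_entropy(probs)
    # Sort by descending entropy; a stable sort keeps ties in pool order.
    return np.argsort(-scores, kind="stable")[:k]


def select_random(n_unlabeled, k, rng):
    """RL baseline: k indices drawn uniformly at random without replacement."""
    return rng.choice(n_unlabeled, size=k, replace=False)
```

In a training loop, either selector would be called once per round: the chosen samples are sent for annotation, moved into the training set, and the CNN is retrained before the next round. The two strategies can then be compared at equal training set sizes.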