Biomedical compound figure detection using deep learning and fusion techniques

Images contain significant amounts of information but present different challenges than text does. One such challenge is the compound figure: an image made up of two or more subfigures. A deep learning model is proposed for compound figure detection (CFD) in the biomedical article domain. First, pre-trained convolutional neural networks (CNNs) are selected for transfer learning, both to take advantage of the image classification performance of CNNs and to compensate for the limited size of the CFD datasets. Next, the pre-trained CNNs are fine-tuned on the training data with early stopping to avoid overfitting. Alternatively, layer activations of the pre-trained CNNs are extracted and used as input features to a support vector machine classifier. Finally, individual model outputs are combined with score-based fusion. The proposed combined model achieved best test accuracies of 90.03% and 96.93% on the ImageCLEF 2015 and 2016 CFD subtask datasets, respectively, outperforming traditional hand-crafted and other deep learning representations. These results were obtained with pre-trained AlexNet, VGG-16, and VGG-19 CNNs, fine-tuned until validation accuracy stopped improving and combined with the combPROD score-based fusion operator.
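
As an illustration of the score-based fusion step, the following minimal sketch applies a product rule in the spirit of combPROD: the class scores from each model are multiplied class by class, and the class with the largest product is predicted. The function names, the label order, and the score values are hypothetical and chosen only for illustration; they are not outputs of the models described here, and the sketch assumes each model emits one score per class in the same order.

```python
from math import prod


def comb_prod(score_lists):
    """Fuse per-model class scores by multiplying them class by class."""
    n_classes = len(score_lists[0])
    if any(len(scores) != n_classes for scores in score_lists):
        raise ValueError("every model must provide one score per class")
    return [prod(scores[c] for scores in score_lists) for c in range(n_classes)]


def predict(score_lists, labels=("compound", "non-compound")):
    """Return the label with the highest fused score, plus the fused scores."""
    fused = comb_prod(score_lists)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Hypothetical softmax outputs for one figure from three fine-tuned CNNs
# (for example AlexNet, VGG-16, VGG-19); the values are made up.
scores = [
    [0.60, 0.40],
    [0.45, 0.55],
    [0.70, 0.30],
]

label, fused = predict(scores)
print(label, fused)
```

In this made-up example the second model alone would predict "non-compound", but the product over all three models favors "compound". A product rule penalizes any class that a single model scores close to zero, which is one reason it can behave differently from a sum or a majority vote.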