Structure Guided Deep Neural Network for Unsupervised Active Learning

Unsupervised active learning, whose goal is to choose a subset of representative samples to be labeled in an unsupervised setting, has become an active research topic in the machine learning and computer vision communities. Most existing approaches rely on shallow linear models: they assume that each sample can be well approximated by the span (i.e., the set of all linear combinations) of the selected samples, and then take these selected samples as the representative ones for manual labeling. However, in many real-world scenarios the data do not conform to linear models, and modeling the nonlinearity of the data is often the key to unsupervised active learning. Moreover, existing works often aim to reconstruct the whole dataset well while ignoring the important cluster structure, which matters especially for imbalanced data. In this paper, we present a novel deep unsupervised active learning framework. The proposed method explicitly learns a nonlinear embedding that maps each input into a latent space via a deep neural network, and introduces a selection block that selects the representative samples in the learnt latent space through a self-supervised learning strategy. In the selection block, we aim not only to preserve the global structure of the data, but also to capture its cluster structure, in order to handle data imbalance during sample selection. Meanwhile, we use the clustering result to provide self-supervised information that guides the above processes. Finally, we attempt to preserve the local structure of the data, so that the data embedding becomes more precise and the model performance is further improved. Extensive experimental results on several publicly available datasets demonstrate the effectiveness of our method compared with state-of-the-art methods.
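
To make the linear-span assumption of the shallow baselines concrete, the following is a minimal sketch of reconstruction-based sample selection. It is not the method proposed in this paper, nor any specific published baseline: the greedy search, the least-squares fit, and the names `reconstruction_error` and `greedy_select` are assumptions introduced purely for illustration.

```python
import numpy as np

def reconstruction_error(X, selected):
    """Squared Frobenius error of approximating every row of X
    by a linear combination of the rows indexed by `selected`."""
    S = X[selected]                                  # (k, d) selected samples
    # Least-squares fit of S.T @ W_T ~= X.T, so W_T has shape (k, n).
    W_T, *_ = np.linalg.lstsq(S.T, X.T, rcond=None)
    return float(np.linalg.norm(W_T.T @ S - X) ** 2)

def greedy_select(X, k):
    """Illustrative greedy search (an assumption, not the paper's selection
    block): repeatedly add the sample that most reduces the error."""
    selected = []
    for _ in range(k):
        candidates = [i for i in range(len(X)) if i not in selected]
        best = min(candidates,
                   key=lambda i: reconstruction_error(X, selected + [i]))
        selected.append(best)
    return selected

# Toy data: 20 samples lying in a 2-dimensional subspace of R^5.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 5))
X = rng.normal(size=(20, 2)) @ basis
chosen = greedy_select(X, 2)
```

On this toy data, two selected samples suffice to reconstruct all others almost exactly, because the data are linear by construction. Data lying on a nonlinear manifold would not satisfy this assumption, which is the limitation that motivates learning a nonlinear embedding.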