More Attentional Local Descriptors for Few-Shot Learning

Learning from a few examples remains a key challenge for many computer vision tasks, and few-shot learning has been proposed to tackle it. In image classification, the goal is to learn a classifier when each class has only a few labeled samples. Existing methods, which use a fully connected layer or global average pooling for the final classification, have achieved considerable progress. However, with so few samples, global features may no longer be useful. Local features are more conducive to few-shot learning, but they inevitably contain some noise. Meanwhile, the attention mechanism, inspired by the human visual system, can extract more valuable information and is widely used in various areas. Therefore, in this paper we propose a method called More Attentional Deep Nearest Neighbor Neural Network (MADN4 for short), which combines local descriptors with an attention mechanism and is trained end-to-end from scratch. Experimental results on four benchmark datasets demonstrate the superior capability of our method.