Dynamic Attention Loss for Small-Sample Image Classification

Convolutional neural networks (CNNs) have been applied successfully to a wide range of image classification tasks and have become one of the most powerful machine learning approaches. To improve generalization and performance on small-sample image classification, a recent trend is to learn discriminative features with CNNs. The idea of this paper is to reduce confusion between categories so that the network extracts discriminative features and enlarges inter-class variance, especially for classes whose features are hard to distinguish. We propose a loss function termed Dynamic Attention Loss (DAL), which introduces a confusion-rate-weighted soft label (target) to control the similarity measurement between categories and dynamically gives corresponding attention to samples during training, especially those that are misclassified. Experimental results demonstrate that DAL achieves better performance than Cross-Entropy Loss and Focal Loss on the LabelMe and Caltech101 datasets.
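The abstract does not give the DAL formula, so the following is only a minimal sketch of one possible reading of a "confusion-rate-weighted soft label", not the authors' formulation. It assumes that confusion rates are estimated from the current predictions, that a hypothetical mixing weight `alpha` moves part of the target mass onto the classes a sample's true class is confused with, and that the loss is ordinary cross-entropy against those soft targets. The function names and the direction of the weighting are illustrative assumptions.

```python
import numpy as np

def confusion_rates(y_true, y_pred, num_classes):
    """rates[i, j]: share of class-i errors that were predicted as class j.

    Assumption: the rates are re-estimated from recent predictions during
    training; the abstract does not say how they are computed.
    """
    counts = np.zeros((num_classes, num_classes), dtype=float)
    np.add.at(counts, (y_true, y_pred), 1.0)   # accumulate the confusion matrix
    np.fill_diagonal(counts, 0.0)              # keep only misclassifications
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes with no errors keep an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def soft_labels(y_true, rates, alpha=0.1):
    """Mix one-hot targets with the confusion rates of each sample's true class.

    `alpha` is a hypothetical mixing weight, not a parameter from the paper.
    """
    one_hot = np.eye(rates.shape[0])[y_true]
    confused = rates[y_true]
    has_confusion = confused.sum(axis=1, keepdims=True) > 0
    mixed = (1.0 - alpha) * one_hot + alpha * confused
    # Samples whose class is never confused keep a plain one-hot target.
    return np.where(has_confusion, mixed, one_hot)

def soft_cross_entropy(logits, targets):
    """Mean cross-entropy between softmax(logits) and soft targets."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=1).mean())
```

In a training loop, `confusion_rates` would be recomputed periodically so that the targets track which categories the model currently confuses; that recomputation is what makes the weighting dynamic in this sketch.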
