Hybrid-loss supervision for deep neural network

Abstract Multi-loss joint optimization has proven effective in the computer vision literature. However, the learned deep sub-features usually fit their own disjoint constraints, which yields conflict and spatial inconsistency among the sub-features when the FC layers are not shared. In this paper, we propose a Hybrid-loss supervision (HLS) framework to obtain smoother and more spatially consistent features with shared FC layers. First, we theoretically analyze the shortcomings of single-loss supervision in the existing framework. Then, we select two well-known loss functions, Center loss and Weighted loss, and instantiate the HLS framework as their linear combination. With this instantiation, the network learns more compact intra-class deep features and more uniform inter-class deep features. The HLS framework can significantly boost the efficiency of existing convolutional networks on both image classification and object detection tasks without increasing network parameters or computational complexity. Extensive experiments on different vision tasks demonstrate consistent improvements across a variety of datasets (e.g., CIFAR-10/100, ImageNet-2012, PASCAL VOC and MS-COCO) and different convolutional neural network architectures.
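The linear combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the Weighted loss is taken to be a class-weighted softmax cross-entropy, the Center loss is half the squared distance between a feature and its class center, and the names `hybrid_loss`, `class_weights`, and the mixing coefficient `lam` are hypothetical. In the setting the abstract describes, both terms would be computed from the output of the same shared FC layers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, label, class_weights):
    """Assumed form of the Weighted loss: class-weighted softmax cross-entropy."""
    probs = softmax(logits)
    return -class_weights[label] * math.log(probs[label])


def center_loss(feature, label, centers):
    """Half the squared Euclidean distance from a feature to its class center."""
    center = centers[label]
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def hybrid_loss(features, logits, labels, centers, class_weights, lam=0.01):
    """Batch mean of a linear combination of the two losses.

    `lam` is a hypothetical mixing coefficient; the abstract does not state
    how the two terms are weighted.
    """
    total = 0.0
    for feat, z, y in zip(features, logits, labels):
        total += weighted_cross_entropy(z, y, class_weights)
        total += lam * center_loss(feat, y, centers)
    return total / len(labels)


if __name__ == "__main__":
    centers = [[1.0, 2.0], [0.0, 0.0]]   # one center per class
    features = [[2.0, 2.0]]              # one sample, one unit away from center 0
    logits = [[0.0, 0.0]]                # uniform prediction over two classes
    print(hybrid_loss(features, logits, [0], centers, [1.0, 1.0], lam=0.5))
```

Because both terms are scalar functions of the same shared features, combining them adds no network parameters beyond the class centers in this sketch; whether the centers are learned or updated by a running average is left open here.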
