Deep InterBoost Networks for Small-sample Image Classification

Abstract

Deep neural networks have recently shown excellent performance on numerous image classification tasks. These networks often have a large number of parameters to estimate and therefore require large amounts of training data. When training data are scarce, however, a highly flexible network quickly overfits, resulting in large model variance and poor generalization. To address this problem, we propose a new, simple yet effective ensemble method called InterBoost for small-sample image classification. In the training phase, InterBoost first randomly generates two sets of complementary weights over the training data and uses them to separately train two base networks of the same structure. The two sets of complementary weights are then updated, through interaction between the two previously trained base networks, to refine subsequent training. This interactive training process continues iteratively until a stopping criterion is met. In the testing phase, the outputs of the two networks are combined into a single score for classification. Experimental results on four small-sample datasets, UIUC-Sports, LabelMe, 15Scenes and Caltech101, demonstrate that the proposed ensemble method outperforms existing ones. Moreover, Wilcoxon signed-rank tests show that our method is statistically significantly better than the compared methods. A detailed analysis is also provided for an in-depth understanding of the proposed method.
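To make the described procedure concrete, the following is a minimal sketch of the two-network interactive training loop, not the authors' implementation. The base-network architecture, the number of interaction rounds, and the exact weight-update rule are not specified in the abstract and are assumptions here; in particular, the update that pushes each example's weight toward the network that currently fits it worse is only one plausible rule that preserves the complementarity constraint (the two weights of each example sum to one).

```python
# Hypothetical sketch of InterBoost-style interactive training (PyTorch).
# Assumptions: toy MLP base networks, fixed round count as the stop
# criterion, and a loss-ratio weight update; none of these are taken
# from the paper itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_base_net(n_in, n_classes):
    # Both base networks share the same (assumed) structure.
    return nn.Sequential(nn.Linear(n_in, 64), nn.ReLU(), nn.Linear(64, n_classes))

def train_weighted(net, x, y, w, epochs=20, lr=1e-2):
    # Train one base network with a per-example weighted cross-entropy loss.
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = (w * F.cross_entropy(net(x), y, reduction="none")).sum() / w.sum()
        loss.backward()
        opt.step()

def interboost(x, y, n_classes, rounds=5):
    n, d = x.shape
    net1, net2 = make_base_net(d, n_classes), make_base_net(d, n_classes)
    # Randomly generated complementary weights: w1 + w2 = 1 for each example.
    w1 = torch.rand(n)
    w2 = 1.0 - w1
    for _ in range(rounds):  # stop criterion simplified to a fixed round count
        train_weighted(net1, x, y, w1)
        train_weighted(net2, x, y, w2)
        with torch.no_grad():
            e1 = F.cross_entropy(net1(x), y, reduction="none")
            e2 = F.cross_entropy(net2(x), y, reduction="none")
            # Hypothetical interaction step: each example's weight for net1 is
            # proportional to net1's current loss on it, so the complementary
            # structure of the two weight sets is maintained.
            w1 = e1 / (e1 + e2 + 1e-12)
            w2 = 1.0 - w1
    return net1, net2

def predict(net1, net2, x):
    # Testing phase: combine the two networks' outputs into one final score.
    with torch.no_grad():
        return (F.softmax(net1(x), dim=1) + F.softmax(net2(x), dim=1)) / 2

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(60, 16)            # toy stand-in for image features
    y = torch.randint(0, 3, (60,))
    n1, n2 = interboost(x, y, n_classes=3)
    print(predict(n1, n2, x[:4]))
```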
