Sample Balancing for Deep Learning-Based Visual Recognition

Sample balancing comprises sample selection and sample reweighting. Sample selection aims to remove bad samples that may lead to bad local optima, while sample reweighting aims to assign weights to samples so as to improve performance. In this article, we integrate a sample selection method based on self-paced learning into deep learning frameworks and study how different sample selection strategies influence the training of deep networks. In addition, most existing sample reweighting methods take the per-class sample count as their metric, which does not fully account for sample quality. To improve performance, we propose a novel metric based on multiview semantic encoders to reweight the samples more appropriately. We then propose an optimization mechanism that embeds sample weights into the loss functions of deep networks, so that the networks can be trained end to end. We conduct experiments on the CIFAR and ImageNet data sets. The experimental results demonstrate that the proposed sample balancing method can improve the performance of deep learning methods in several visual recognition tasks.
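
As a rough illustration of how sample selection and sample weights can enter a loss function, the sketch below applies a generic hard-threshold self-paced selection rule: samples whose current loss is below a threshold are kept, and the rest receive zero weight. This is not necessarily the method of this article; the function names, the threshold value, and the toy predictions are all assumptions made for the example.

```python
import math

def cross_entropy(probs, label):
    """Per-sample cross-entropy for a list of predicted class probabilities."""
    return -math.log(max(probs[label], 1e-12))

def self_paced_select(losses, threshold):
    """Hard self-paced selection (assumed rule): weight 1 if loss < threshold, else 0."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]

def weighted_loss(losses, weights):
    """Weighted mean of per-sample losses; returns 0.0 when every weight is zero."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * l for w, l in zip(weights, losses)) / total

# Toy batch (hypothetical values): three samples, all labeled as class 0.
predictions = [[0.9, 0.1], [0.6, 0.4], [0.05, 0.95]]
labels = [0, 0, 0]

losses = [cross_entropy(p, y) for p, y in zip(predictions, labels)]
mask = self_paced_select(losses, threshold=1.0)  # threshold is an assumed value
batch_loss = weighted_loss(losses, mask)
```

In self-paced schemes of this kind the threshold is typically raised as training proceeds, so harder samples are admitted gradually. Replacing the binary mask with real-valued weights gives a reweighting loss of the same form.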
