Boosting Robust Learning Via Leveraging Reusable Samples in Noisy Web Data

Webly-supervised fine-grained visual classification (FGVC) has attracted increasing attention in recent years because of the unaffordable cost of obtaining correctly-labeled large-scale fine-grained datasets. However, due to the existence of label noise in web images and the high memorization capacity of deep neural networks, training deep fine-grained (FG) models directly through web images tends to have an inferior recognition ability. In the literature, to alleviate this issue, loss correction methods try to estimate the noise transition matrix, but the inevitable false correction would cause accumulated errors. Sample selection methods identify clean (“easy”) samples based on the fact that small losses can alleviate the accumulated errors. However, “hard” and mislabeled examples that can both boost the robustness of FG models are also dropped. To this end, we propose a certainty-based reusable sample selection and correction approach, termed as CRSSC, for coping with label noise in training deep FG models with web images. Our key idea is to additionally identify and correct reusable samples, and then leverage them together with clean examples to update the network. Furthermore, in order to endow our model with the capability to capture richer and more discriminative feature representations, we propose a cross-layer attention-based feature refinement (CLAR) block. We demonstrate the superiority of the proposed approach from both theoretical and experimental perspectives.

[1]  Fumin Shen,et al.  Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating Open-Set Noise and Utilizing Hard Examples , 2021, IEEE Transactions on Multimedia.

[2]  Xiu-Shen Wei,et al.  CRSSC: Salvage Reusable Samples from Noisy Data for Robust Learning , 2020, ACM Multimedia.

[3]  Zheng Zhang,et al.  Web-Supervised Network with Softly Update-Drop Training for Fine-Grained Visual Classification , 2020, AAAI.

[4]  Lei Feng,et al.  Combating Noisy Labels by Agreement: A Joint Training Method with Co-Regularization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Qiang Wu,et al.  Low-Rank Pairwise Alignment Bilinear Network For Few-Shot Fine-Grained Image Classification , 2019, IEEE Transactions on Multimedia.

[6]  Ashish Vaswani,et al.  Stand-Alone Self-Attention in Vision Models , 2019, NeurIPS.

[7]  Jasper Snoek,et al.  Likelihood Ratios for Out-of-Distribution Detection , 2019, NeurIPS.

[8]  Tao Mei,et al.  Destruction and Construction Learning for Fine-Grained Image Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Jae-Gil Lee,et al.  SELFIE: Refurbishing Unclean Samples for Robust Deep Learning , 2019, ICML.

[10]  Kun Yi,et al.  Probabilistic End-To-End Noise Correction for Learning With Noisy Labels , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Jiebo Luo,et al.  Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Yizhou Yu,et al.  Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Xingrui Yu,et al.  How does Disagreement Help Generalization against Label Corruption? , 2019, ICML.

[14]  Ling Shao,et al.  Extracting Multiple Visual Senses for Web Learning , 2019, IEEE Transactions on Multimedia.

[15]  Yafei Song,et al.  Learning From Large-Scale Noisy Web Data With Ubiquitous Reweighting for Image Classification , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Dong Wang,et al.  Learning to Navigate for Fine-grained Classification , 2018, ECCV.

[17]  Ramesh Raskar,et al.  Maximum-Entropy Fine-Grained Classification , 2018, NeurIPS.

[18]  In-So Kweon,et al.  CBAM: Convolutional Block Attention Module , 2018, ECCV.

[19]  Yu-Kun Lai,et al.  Recognition From Web Data: A Progressive Filtering Approach , 2018, IEEE Transactions on Image Processing.

[20]  Errui Ding,et al.  Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition , 2018, ECCV.

[21]  Yang Song,et al.  Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Subhransu Maji,et al.  Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Ashok Veeraraghavan,et al.  Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-Grained Classification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Dacheng Tao,et al.  Webly-Supervised Fine-Grained Visual Categorization via Deep Domain Adaptation , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Masashi Sugiyama,et al.  Co-teaching: Robust training of deep neural networks with extremely noisy labels , 2018, NeurIPS.

[26]  Xiu-Shen Wei,et al.  Mask-CNN: Localizing parts and selecting descriptors for fine-grained bird species categorization , 2018, Pattern Recognit..

[27]  Bin Yang,et al.  Learning to Reweight Examples for Robust Deep Learning , 2018, ICML.

[28]  Jianfei Cai,et al.  Decoupled Spatial Neural Attention for Weakly Supervised Semantic Segmentation , 2018, IEEE Transactions on Multimedia.

[29]  Li Fei-Fei,et al.  MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels , 2017, ICML.

[30]  Qilong Wang,et al.  Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Tao Mei,et al.  Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[32]  Jie Hu,et al.  Squeeze-and-Excitation Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Subhransu Maji,et al.  Improved Bilinear Pooling with CNNs , 2017, BMVC.

[35]  Xiao Liu,et al.  Kernel Pooling for Convolutional Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Michael Lam,et al.  Fine-Grained Recognition as HSnet Search for Informative Image Parts , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Tao Mei,et al.  Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Yang Song,et al.  The iNaturalist Species Classification and Detection Dataset , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Heng Tao Shen,et al.  Video Captioning With Attention-Based LSTM and Semantic Consistency , 2017, IEEE Transactions on Multimedia.

[40]  Yoshua Bengio,et al.  A Closer Look at Memorization in Deep Networks , 2017, ICML.

[41]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[42]  R. Srikant,et al.  Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks , 2017, ICLR.

[43]  Shai Shalev-Shwartz,et al.  Decoupling "when to update" from "how to update" , 2017, NIPS.

[44]  Weiyao Lin,et al.  Picking Neural Activations for Fine-Grained Recognition , 2017, IEEE Transactions on Multimedia.

[45]  Xiaogang Wang,et al.  Residual Attention Network for Image Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Andrew McCallum,et al.  Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples , 2017, NIPS.

[47]  Ramakant Nevatia,et al.  DECK: Discovering Event Composition Knowledge from Web Images for Zero-Shot Event Detection and Recounting in Videos , 2017, AAAI.

[48]  Larry S. Davis,et al.  Learning a Discriminative Filter Bank Within a CNN for Fine-Grained Recognition , 2016, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[49]  Jian Zhang,et al.  Exploiting Web Images for Dataset Construction: A Domain Robust Approach , 2016, IEEE Transactions on Multimedia.

[50]  Shu Kong,et al.  Low-Rank Bilinear Pooling for Fine-Grained Classification , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Samy Bengio,et al.  Understanding deep learning requires rethinking generalization , 2016, ICLR.

[52]  Jacob Goldberger,et al.  Training deep neural-networks using a noise adaptation layer , 2016, ICLR.

[53]  Chen Sun,et al.  Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames , 2016, ECCV.

[54]  Richard Nock,et al.  Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Yongdong Zhang,et al.  Coarse-to-Fine Description for Fine-Grained Visual Categorization , 2016, IEEE Transactions on Image Processing.

[56]  Yi Yang,et al.  You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Abhinav Gupta,et al.  Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Ya Zhang,et al.  Part-Stacked CNN for Fine-Grained Visual Categorization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Feng Zhou,et al.  Fine-Grained Categorization and Dataset Bootstrapping Using Deep Metric Learning with Humans in the Loop , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[61]  Ya Zhang,et al.  Augmenting Strong Supervision Using Web Data for Fine-Grained Categorization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[62]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Jonathan Krause,et al.  The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition , 2015, ECCV.

[64]  Yang Gao,et al.  Compact Bilinear Pooling , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[65]  Cewu Lu,et al.  Deep LAC: Deep localization, alignment and classification for fine-grained recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Dumitru Erhan,et al.  Training Deep Neural Networks on Noisy Labels with Bootstrapping , 2014, ICLR.

[67]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[68]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[69]  Trevor Darrell,et al.  Part-Based R-CNNs for Fine-Grained Category Detection , 2014, ECCV.

[70]  Pietro Perona,et al.  Bird Species Categorization Using Pose Normalized Deep Convolutional Nets , 2014, ArXiv.

[71]  Jonathan Krause,et al.  3D Object Representations for Fine-Grained Categorization , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[72]  Subhransu Maji,et al.  Fine-Grained Visual Classification of Aircraft , 2013, ArXiv.

[73]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[74]  Daphne Koller,et al.  Self-Paced Learning for Latent Variable Models , 2010, NIPS.

[75]  Joseph Picone,et al.  Support vector machines for automatic data cleanup , 2000, INTERSPEECH.

[76]  Xiangtao Zheng,et al.  Fine-Grained Visual Categorization by Localizing Object Parts With Single Image , 2021, IEEE Transactions on Multimedia.

[77]  Dacheng Tao,et al.  Webly-supervised Fine-grained Visual Categorization via Deep Domain Adaptation. , 2016, IEEE transactions on pattern analysis and machine intelligence.

[78]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .