A Recursive Ensemble Learning Approach With Noisy Labels or Unlabeled Data

For many tasks, the successful application of deep learning relies on large amounts of training data labeled to a high standard, but much of the data in real-world applications suffers from label noise. Data annotation is far more expensive and resource-intensive than data collection, which restricts the successful deployment of deep learning to applications with very large, well-labeled datasets. To address this problem, we propose a recursive ensemble learning approach that maximizes the utilization of data. Its core ideas are a disagreement-based annotation method and several voting strategies. We also provide guidelines for choosing the most suitable networks among many candidates, together with a pruning strategy that makes this selection convenient. The approach is especially effective when the original dataset contains significant label noise. We conducted experiments on the Cats versus Dogs dataset, in which significant label noise was present, and on the CIFAR-10 dataset, achieving promising results.
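The abstract does not specify the voting mechanics, but the disagreement-based annotation idea it describes can be sketched roughly as follows: an ensemble of trained classifiers predicts labels for unlabeled (or suspect) samples, and a sample receives a pseudo-label only when enough ensemble members agree; otherwise it is set aside. The function names and the agreement threshold below are illustrative assumptions, not the paper's actual implementation.

```python
from collections import Counter

def ensemble_vote(predictions, min_agreement):
    """Hard-voting with a disagreement threshold.

    Returns the majority label if at least `min_agreement` ensemble
    members predicted it; returns None (i.e. leave the sample
    unlabeled) when the ensemble disagrees too much.
    """
    label, count = Counter(predictions).most_common(1)[0]
    if count >= min_agreement:
        return label
    return None

def annotate_unlabeled(ensemble, samples, min_agreement):
    """Split samples into confidently pseudo-labeled and rejected sets.

    `ensemble` is any sequence of callables mapping a sample to a label.
    Rejected samples could be revisited in a later recursion round,
    after the ensemble has been retrained on the accepted set.
    """
    accepted, rejected = [], []
    for x in samples:
        preds = [clf(x) for clf in ensemble]
        label = ensemble_vote(preds, min_agreement)
        if label is None:
            rejected.append(x)
        else:
            accepted.append((x, label))
    return accepted, rejected
```

In a recursive setup, the accepted pseudo-labeled samples would be merged into the training set, the ensemble retrained, and the rejected samples re-annotated in the next round; tightening `min_agreement` trades annotation coverage for label quality.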
