Impact of biased mislabeling on learning with deep networks

The aim of machine learning is to obtain a model that correctly predicts unseen data. Training such models requires a sufficient number of clean examples of the ground truth. However, some applications' datasets are not guaranteed to consist entirely of pure examples and may contain mislabeled data. Handling mislabeled data is a domain of outlier statistics and has been studied to some extent in the context of machine learning. Here we ask how mislabeled data in a training set affects classification performance in deep neural networks. More specifically, motivated by an industrial application, we consider the case where the probability of class mislabeling in the training set varies considerably between classes. We hence contrast in this paper the case of systematic mislabeling of one class with the more commonly studied situation of uniform mislabeling across all classes. We demonstrate that non-uniform mislabeling is more challenging than the more commonly studied uniform case. We also explicitly explore the dependence of our findings on the size of the training data, which is not only a common limiting factor in industrial applications but also has a large effect on the results. We demonstrate that deep networks have an inherent robustness when large datasets are available.
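The contrast the abstract draws between uniform and biased (class-dependent) label noise can be made concrete with a small sketch. The function below is illustrative, not from the paper: it corrupts a label vector using a per-class flip probability, so setting all probabilities equal reproduces uniform mislabeling, while concentrating the probability on one class reproduces the systematic, non-uniform case studied here. Names and the choice of a uniform target class for flipped labels are assumptions.

```python
import numpy as np

def inject_label_noise(labels, flip_probs, num_classes, rng=None):
    """Flip each label to a different class chosen uniformly at random,
    with a probability that depends on the example's true class
    (biased/non-uniform mislabeling when flip_probs differ per class)."""
    rng = np.random.default_rng(rng)
    labels = np.asarray(labels)
    noisy = labels.copy()
    # Draw one flip decision per example, using that example's class probability.
    flip = rng.random(len(labels)) < np.asarray(flip_probs)[labels]
    # For flipped examples, shift to one of the other classes uniformly.
    offsets = rng.integers(1, num_classes, size=int(flip.sum()))
    noisy[flip] = (labels[flip] + offsets) % num_classes
    return noisy

labels = np.repeat(np.arange(3), 100)  # 3 classes, 100 examples each

# Uniform noise: every class mislabeled with probability 0.2.
uniform = inject_label_noise(labels, [0.2, 0.2, 0.2], num_classes=3, rng=0)

# Biased noise: only class 0 is mislabeled, with high probability.
biased = inject_label_noise(labels, [0.8, 0.0, 0.0], num_classes=3, rng=0)
```

Under this setup, a training run on `uniform` versus `biased` labels (at a matched overall noise rate) would expose exactly the asymmetry the paper investigates.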
