Back-End: A Noise Rate Estimation Method in the Presence of Class Conditional Noise

In the era of big data, massive amounts of data bring us abundant information as well as new challenges in example annotation. Because label noise is common in real-world datasets, weakly supervised learning has attracted growing attention. In this paper, we address the problem of estimating the noise rate matrix given a small proportion of clean data, which is of great significance for learning in the presence of class-conditional noise. After reviewing several recent methods, we propose a more comprehensive and interpretable algorithm called Back-End, which attempts to capture the noise characteristics from the discrepancy between the noisy and clean datasets. We also develop novel evaluation metrics from the perspective of matrix distance. Experiments on binary and multi-class datasets verify the effectiveness of the Back-End algorithm. Future work will focus on adaptations of this algorithm.
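The abstract does not specify the internals of Back-End, but the general setting it describes can be sketched as follows: given a small trusted subset for which both the clean label and the noisy label are observed, the class-conditional noise rate matrix T, with T[i, j] = P(noisy label = j | true label = i), can be estimated by counting label flips, and candidate estimates can be compared to a ground-truth matrix with a matrix-distance metric such as the Frobenius norm. The function and variable names below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def estimate_transition_matrix(clean_labels, noisy_labels, num_classes):
    """Estimate T[i, j] = P(noisy = j | true = i) by counting label
    flips on a small trusted subset whose true labels are known.
    This is a naive counting baseline, not the Back-End algorithm."""
    counts = np.zeros((num_classes, num_classes))
    for y_true, y_noisy in zip(clean_labels, noisy_labels):
        counts[y_true, y_noisy] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1  # avoid division by zero for unseen classes
    return counts / row_sums  # each row is a conditional distribution

def frobenius_distance(T_est, T_true):
    """One matrix-distance style evaluation metric: ||T_est - T_true||_F."""
    return np.linalg.norm(T_est - T_true)
```

For example, if four of six trusted examples have true label 0 and one of them was flipped to label 1, the estimated first row of T would be [0.75, 0.25], and the Frobenius distance to the identity (noise-free) matrix would quantify the estimated noise level.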
