Robust Semi-Supervised Classification for Noisy Labels Based on Self-Paced Learning

Data labeling is a tedious, subjective task that is time-consuming and error-prone, yet most learning algorithms are sensitive to noisy labels. This raises the need for algorithms that can exploit large amounts of unlabeled data while remaining robust to noisy label information. In this letter, we propose a novel semi-supervised classification framework that is robust to noisy labels, named self-paced manifold regularization. The proposed framework naturally integrates the self-paced learning regime into the manifold regularization framework to select labeled training samples in a theoretically sound manner, and uses locally linear reconstructions to control the smoothness of the classifier with respect to the manifold structure of the data. Finally, an alternating search strategy is adopted to obtain the classifier. The proposed method not only suppresses the negative effect of noisy initial labels in semi-supervised learning, but also yields an explicit multiclass classifier for newly arriving data points. Experimental results demonstrate the effectiveness of the proposed method.
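The self-paced selection and alternating search described above can be sketched as follows. This is a minimal illustration, not the letter's actual formulation: it uses a hard-threshold self-paced regularizer with a weighted ridge model on a toy regression problem, whereas the proposed method couples sample selection with manifold regularization and locally linear reconstruction for multiclass classification. All parameter values and helper names here are illustrative assumptions.

```python
import numpy as np

def self_paced_weights(losses, lam):
    # Hard self-paced regularizer: a sample is selected (weight 1)
    # only if its current loss falls below the "age" parameter lam,
    # so easy samples are learned first and suspect labels are skipped.
    return (losses < lam).astype(float)

def fit_weighted_ridge(X, y, w, alpha=1e-2):
    # Weighted ridge fit: only the currently selected (easy) samples
    # influence the model parameters in this round.
    W = np.diag(w)
    d = X.shape[1]
    return np.linalg.solve(X.T @ W @ X + alpha * np.eye(d), X.T @ W @ y)

def self_paced_train(X, y, lam=0.5, growth=2.0, rounds=4):
    # Alternating search: with the model fixed, update sample weights;
    # with the weights fixed, refit the model. Growing lam lets harder
    # samples enter as training matures.
    theta = fit_weighted_ridge(X, y, np.ones(len(y)))
    v = np.ones(len(y))
    for _ in range(rounds):
        losses = (X @ theta - y) ** 2
        v = self_paced_weights(losses, lam)
        theta = fit_weighted_ridge(X, y, v)
        lam *= growth
    return theta, v

# Toy data: a clean linear trend y = 3x + 1 with one grossly noisy label.
rng = np.random.default_rng(0)
X = np.column_stack([np.linspace(0, 1, 20), np.ones(20)])
y = 3.0 * X[:, 0] + 1.0 + 0.01 * rng.standard_normal(20)
y[5] = 10.0  # corrupted label
theta, v = self_paced_train(X, y)
```

After training, the corrupted sample ends up with weight 0 (excluded), and the recovered parameters stay close to the clean trend despite the noisy label, illustrating how self-paced selection suppresses label noise.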
