Semi-Supervised Learning Through Label Propagation on Geodesics

Graph-based semi-supervised learning (SSL) has attracted great attention over the past decade. However, several problems in this area remain open, including: 1) how to construct an effective graph over data with a complex distribution and 2) how to define and effectively use pair-wise similarity for robust label propagation. In this paper, we use a simple and effective graph construction method to build the graph over data lying on multiple data manifolds. The method guarantees connectivity between any pair of data points. The global pair-wise data similarity is then naturally characterized by a geodesic distance-based joint probability, where the geodesic distance is approximated by the graph distance. The new data similarity is much more effective than previous Euclidean distance-based similarities. To exploit the data structure for robust label propagation, the Kullback–Leibler divergence is used to measure the inconsistency between the input pair-wise similarity and the output similarity. To further account for intraclass and interclass variances, a novel regularization term on sample-wise margins is introduced into the objective function. This enables the proposed method to fully utilize the input data structure and the label information for classification. We propose an efficient optimization method for our problem and provide a convergence analysis. In addition, the out-of-sample extension is discussed and addressed. Comparisons with state-of-the-art SSL methods on image classification tasks show the effectiveness of the proposed method.
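
The similarity pipeline described above (graph construction, graph distance as a geodesic approximation, joint probability, KL divergence) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a plain symmetric k-nearest-neighbor graph stands in for the paper's connectivity-guaranteeing construction, the joint probability uses a Gaussian kernel with a single hypothetical bandwidth `sigma` (in the style of stochastic neighbor embedding), and all function names are illustrative.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric kNN graph; edge weights are Euclidean distances, inf = no edge.
    Assumption: a plain kNN graph, not the paper's graph construction."""
    n = len(X)
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    W = np.full((n, n), np.inf)
    np.fill_diagonal(W, 0.0)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]  # skip column 0 (the point itself)
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    W[rows, cols] = D[rows, cols]
    W[cols, rows] = D[cols, rows]  # symmetrize
    return W

def graph_distances(W):
    """All-pairs shortest paths (Floyd-Warshall), approximating geodesics."""
    G = W.copy()
    for m in range(len(G)):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    return G

def joint_probability(G, sigma):
    """Assumed form: p_ij proportional to exp(-g_ij^2 / (2 sigma^2)), i != j."""
    A = np.exp(-(G ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A / A.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over entries where P is positive."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

# Five points on a unit semicircle: the path along the arc is longer than the chord.
theta = np.linspace(0.0, np.pi, 5)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)

G = graph_distances(knn_graph(X, k=2))
P = joint_probability(G, sigma=1.0)
```

On this toy arc, the graph distance between the two endpoints exceeds their Euclidean distance, which is the property that lets a geodesic-based similarity follow the manifold instead of cutting across it. In a label propagation setting, `kl_divergence` would compare `P` with a similarity computed from the predicted labels; how that output similarity is defined is not specified in the abstract and is left out here.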
