Semi-supervised classification by discriminative regularization

Abstract One basic assumption in graph-based semi-supervised classification is the manifold assumption, which states that nearby samples should have similar outputs (labels). However, this assumption may not hold for samples that lie close together but on opposite sides of a class boundary; as a consequence, samples near the boundary are likely to be misclassified. In this paper, we introduce an approach called semi-supervised classification by discriminative regularization (SSCDR for short) to address this problem. SSCDR first constructs a k-nearest-neighbor graph to capture the local manifold structure of the samples, and a discriminative graph to encode the discriminative information derived from constrained clustering on the labeled and unlabeled samples. It then treats the neighborhood graph and the discriminative graph separately within a discriminative regularization framework for semi-supervised classification, forcing nearby samples that lie across the boundary to take different labels. Experimental results on datasets collected from the UCI and LIBSVM repositories and on facial image datasets demonstrate that SSCDR achieves better performance than related methods and is robust to the input value of the parameter k.
