Support vector machine with manifold regularization and partially labeling privacy protection

A novel support vector machine with manifold regularization and partially labeling privacy protection, termed SVM-MR&PLPP, is proposed for semi-supervised learning (SSL) scenarios in which, owing to privacy concerns, only a few labeled data and the class proportions of the unlabeled data are available. It integrates manifold regularization and privacy protection regularization into the Laplacian support vector machine (LapSVM) to improve classification accuracy. Privacy protection here means that only the class proportions of the data are used, not the individual labels. To circumvent the high computational burden of the matrix inversion involved in SVM-MR&PLPP, a scalable version, SSVM-MR&PLPP, is further developed by introducing intermediate decision variables into the original regularization framework. This greatly reduces the cost of computing the corresponding transformed kernel and makes the method highly scalable to large datasets. Experimental results on numerous datasets demonstrate the effectiveness of the proposed classifiers.
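To make the combination of regularizers concrete, the following is a minimal sketch of how such an objective might be evaluated. It is not the formulation of SVM-MR&PLPP itself, which the abstract does not spell out. The hinge loss, ambient norm, and graph-Laplacian terms follow the standard LapSVM form. The class-proportion penalty, the function names (`rbf_kernel`, `knn_laplacian`, `objective`), and the weights `gamma_A`, `gamma_I`, `gamma_P` are assumptions introduced purely for illustration.

```python
import numpy as np


def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Z."""
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    sq = (X ** 2).sum(axis=1)[:, None] + (Z ** 2).sum(axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def knn_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - W of a symmetrized k-nearest-neighbor graph."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sq, np.inf)              # a point is not its own neighbor
    idx = np.argsort(sq, axis=1)[:, :k]       # k closest points per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # symmetrize the adjacency matrix
    return np.diag(W.sum(axis=1)) - W


def objective(alpha, K, L, y_labeled, proportion,
              gamma_A=1e-2, gamma_I=1e-2, gamma_P=1.0):
    """Evaluate a LapSVM-style objective plus an assumed proportion penalty.

    The first len(y_labeled) points are labeled with values in {-1, +1};
    `proportion` is the known fraction of positives among the unlabeled points.
    """
    n = K.shape[0]
    l = len(y_labeled)
    f = K @ alpha                                           # decision values on all points
    hinge = np.mean(np.maximum(0.0, 1.0 - y_labeled * f[:l]))
    ambient = gamma_A * (alpha @ K @ alpha)                 # RKHS norm of f
    manifold = gamma_I / n ** 2 * (f @ L @ f)               # smoothness along the graph
    # Assumed (hypothetical) privacy term: match the mean unlabeled output to the
    # mean label implied by the class proportion, 2p - 1 for labels in {-1, +1}.
    target = 2.0 * proportion - 1.0
    prop = gamma_P * (np.mean(f[l:]) - target) ** 2
    return hinge + ambient + manifold + prop
```

A typical call would build `K = rbf_kernel(X, X)` and `L = knn_laplacian(X)` over labeled and unlabeled points together, then pass candidate coefficients `alpha` to `objective`. Only the aggregate `proportion` of the unlabeled set enters the computation, which mirrors the privacy setting described above.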
