How about utilizing ordinal information from the distribution of unlabeled data

Ordinal regression problems arise in many fields, such as information retrieval, data mining, and knowledge management. In this paper, we consider ordinal regression in a semi-supervised setting, i.e., we try to exploit the ordinal information carried by the distribution of unlabeled data. Semi-supervised ordinal regression is more widely applicable than traditional supervised ordinal regression, because labeled data is expensive and time-consuming to obtain, as it requires human labor, whereas large amounts of unlabeled data have become far more accessible with the development of internet technology. We construct a general semi-supervised ordinal regression framework to formulate this problem. Based on this framework, we propose a semi-supervised ordinal regression method called Semi-supervised Ordinal SVM (SOSVM). Additionally, to make the proposed method applicable to problems with large-scale labeled data, we put forward a kernel-based dual coordinate descent algorithm that solves SOSVM efficiently. Both theoretical analysis and experimental evaluation on real-world datasets support the effectiveness and efficiency of SOSVM.
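The abstract names a kernel-based dual coordinate descent solver but does not give the SOSVM objective. As a rough illustration of the general technique only, the sketch below applies dual coordinate descent to a plain binary kernel SVM (hinge loss, no bias term). It is not the SOSVM algorithm; the function names `rbf_kernel` and `kernel_dcd` and all parameter values are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_dcd(K, y, C=1.0, n_epochs=50, tol=1e-6, seed=0):
    """Dual coordinate descent for a binary kernel SVM (illustrative only).

    Minimizes 0.5 * a^T Q a - sum(a) subject to 0 <= a_i <= C,
    where Q_ij = y_i * y_j * K_ij. This is NOT the SOSVM objective.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    Q = (y[:, None] * y[None, :]) * K
    alpha = np.zeros(n)
    grad = -np.ones(n)  # gradient Q @ alpha - 1 evaluated at alpha = 0
    for _ in range(n_epochs):
        max_pg = 0.0
        for i in rng.permutation(n):
            g = grad[i]
            # Projected gradient: ignore directions that leave the box [0, C].
            if alpha[i] <= 0.0:
                pg = min(g, 0.0)
            elif alpha[i] >= C:
                pg = max(g, 0.0)
            else:
                pg = g
            max_pg = max(max_pg, abs(pg))
            if pg != 0.0 and Q[i, i] > 0.0:
                # Exact one-variable minimizer, clipped to the box.
                new_alpha = min(max(alpha[i] - g / Q[i, i], 0.0), C)
                delta = new_alpha - alpha[i]
                grad += delta * Q[:, i]  # keep the full gradient up to date
                alpha[i] = new_alpha
        if max_pg < tol:  # approximate optimality reached
            break
    return alpha

# Toy usage: two well-separated clusters with labels -1 and +1.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(-2.0, 0.3, (10, 2)),
               data_rng.normal(2.0, 0.3, (10, 2))])
y = np.array([-1.0] * 10 + [1.0] * 10)
K = rbf_kernel(X, gamma=1.0)
alpha = kernel_dcd(K, y, C=1.0)
scores = K @ (alpha * y)  # decision values on the training points
```

Each inner step solves a one-variable subproblem in closed form and then updates the cached gradient with one column of `Q`, so no full quadratic program is ever solved.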
