Quadruply Stochastic Gradient Method for Large Scale Nonlinear Semi-Supervised Ordinal Regression AUC Optimization

Semi-supervised ordinal regression (S$^2$OR) problems are ubiquitous in real-world applications, where only a few ordered instances are labeled and a massive number of instances remain unlabeled. Recent research has shown that directly optimizing the concordance index or AUC can impose a better ranking on the data than optimizing the traditional error rate in ordinal regression (OR) problems. In this paper, we propose an unbiased objective function for S$^2$OR AUC optimization based on the ordinal binary decomposition approach. In addition, to handle large-scale kernelized learning problems, we propose a scalable algorithm called QS$^3$ORAO, which uses the doubly stochastic gradients (DSG) framework for functional optimization. Theoretically, we prove that our method converges to the optimal solution at the rate of $O(1/t)$, where $t$ is the number of iterations for stochastic data sampling. Extensive experimental results on various benchmark and real-world datasets also demonstrate that our method is efficient and effective while retaining similar generalization performance.
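To make the ordinal binary decomposition idea concrete, the following is a minimal sketch, not the QS$^3$ORAO objective itself. It assumes fully labeled data, integer labels 0, ..., k-1, and a single real-valued score per instance, and it ignores the unlabeled data, the kernel, and the stochastic gradient machinery. The function names binary_auc and ordinal_decomposition_auc are hypothetical and chosen only for illustration. Each of the k-1 binary subproblems asks whether a label exceeds a threshold j, and the empirical AUC of each subproblem is the fraction of (positive, negative) pairs that the scores rank correctly.

```python
from itertools import product


def binary_auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as 0.5, the usual convention.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one instance")
    total = 0.0
    for s_pos, s_neg in product(scores_pos, scores_neg):
        if s_pos > s_neg:
            total += 1.0
        elif s_pos == s_neg:
            total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


def ordinal_decomposition_auc(scores, labels, num_classes):
    """Average AUC over the k-1 binary subproblems 'label > j'.

    Illustrative assumption: labels are integers in 0..num_classes-1 and
    every class occurs at least once, so no subproblem is empty.
    """
    aucs = []
    for j in range(num_classes - 1):
        pos = [s for s, y in zip(scores, labels) if y > j]
        neg = [s for s, y in zip(scores, labels) if y <= j]
        aucs.append(binary_auc(pos, neg))
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Toy data: the class-1 instance is scored below the class-0 instance.
    scores = [0.5, 0.2, 0.9]
    labels = [0, 1, 2]
    print(ordinal_decomposition_auc(scores, labels, num_classes=3))
```

In the toy data, the subproblem "label > 0" ranks one of its two pairs correctly and the subproblem "label > 1" ranks both correctly, so the averaged value reflects the single misordered pair. Averaging the subproblem AUCs with equal weight is a design choice of this sketch; other weightings are possible.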
