Less but Better: Generalization Enhancement of Ordinal Embedding via Distributional Margin

In the absence of prior knowledge, ordinal embedding methods obtain new representations for items in a low-dimensional Euclidean space from a set of quadruple-wise comparisons. These ordinal comparisons often come from human annotators, and classical approaches succeed only when sufficiently many comparisons are available. However, collecting a large amount of labeled data is known to be hard, and most existing work pays little attention to generalization ability when samples are insufficient. Meanwhile, recent progress in large margin theory shows that, rather than the minimum margin alone, the margin mean and variance, which characterize the margin distribution, are more crucial to overall generalization performance. To address the issue of insufficient training samples, we propose a margin distribution learning paradigm for ordinal embedding, entitled Distributional Margin based Ordinal Embedding (DMOE). Specifically, we first define the margin for the ordinal embedding problem. Second, we formulate a concise objective function that avoids directly maximizing the margin mean and minimizing the margin variance but has a similar effect. Moreover, an Augmented Lagrange Multiplier based algorithm is customized to seek the optimal solution of DMOE effectively. Experimental studies on both simulated and real-world datasets are provided to show the effectiveness of the proposed algorithm.
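To make the notion of a margin distribution concrete, the following minimal sketch computes the minimum, mean, and variance of margins for a toy embedding. It is an illustration only: the abstract does not state the paper's margin definition, so the margin used here (squared distance of the farther pair minus squared distance of the closer pair) is an assumption, and the names `sq_dist`, `margins`, `X`, and `quads` are hypothetical.

```python
from statistics import mean, pvariance


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def margins(X, quads):
    """Margins of quadruple-wise comparisons under an embedding X.

    Each quadruple (i, j, k, l) is read as "items i and j are more similar
    than items k and l". The margin below is an ASSUMED definition, not
    necessarily the one used by DMOE: d(k, l)^2 - d(i, j)^2, which is
    positive exactly when the embedding respects the comparison.
    """
    return [sq_dist(X[k], X[l]) - sq_dist(X[i], X[j]) for i, j, k, l in quads]


# Toy 2-D embedding of four items and three comparisons (made-up data).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 2.0)]
quads = [(0, 1, 0, 2), (0, 1, 2, 3), (0, 2, 2, 3)]

m = margins(X, quads)
min_margin = min(m)          # what a classical large-margin method maximizes
margin_mean = mean(m)        # first-order statistic of the margin distribution
margin_var = pvariance(m)    # second-order statistic of the margin distribution

print(m, min_margin, margin_mean, margin_var)
```

In this sketch, a minimum-margin criterion looks only at `min_margin`, whereas a margin-distribution criterion would favor embeddings with a large `margin_mean` and a small `margin_var`.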
