A Combinatorial Approach to Multidimensional Scaling

In standard Multidimensional Scaling (MDS), one seeks a low-dimensional representation of a set of n objects such that the pairwise dissimilarities among the original objects are realized as distances in the embedded space with minimum error. We propose an MDS algorithm that, in addition to minimizing the usual Stress function, can accommodate further optimization criteria as well as side constraints associated with the underlying visualization task. We present an application in which we attempt to minimize a secondary objective function: the cluster membership discrepancy between a given cluster structure in the original data and the resulting cluster structure in the low-dimensional embedding. Preliminary computational experiments show that the algorithm is able to find MDS embeddings that preserve the original cluster structure while incurring a relatively small increase in Stress compared with standard MDS. Finally, we discuss a few properties of the algorithm that make it an interesting choice for Big Data visualization.
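
To make the two objectives concrete, the following is a minimal sketch of how they might be evaluated for a candidate embedding. It is not the algorithm proposed here. It assumes raw Stress (the sum of squared differences between each dissimilarity and the corresponding embedded Euclidean distance) and takes the cluster membership discrepancy to be a simple count of objects whose cluster label changes, with labels assumed to be already matched between the two clusterings. The function names `raw_stress` and `membership_discrepancy` and the toy data are hypothetical.

```python
import math
from itertools import combinations


def raw_stress(dissim, coords):
    """Sum over pairs i < j of (dissimilarity - embedded Euclidean distance)^2."""
    total = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d_ij = math.dist(coords[i], coords[j])
        total += (dissim[i][j] - d_ij) ** 2
    return total


def membership_discrepancy(labels_original, labels_embedded):
    """Count objects whose cluster label differs between the two clusterings.

    Assumption: cluster labels are already aligned across the two clusterings.
    """
    return sum(a != b for a, b in zip(labels_original, labels_embedded))


# Toy data: three objects whose dissimilarities form a 3-4-5 right triangle.
dissim = [
    [0, 3, 4],
    [3, 0, 5],
    [4, 5, 0],
]

# An embedding that realizes the dissimilarities exactly in the plane.
exact = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
# A poorer embedding that places the third object on top of the first.
collapsed = [(0.0, 0.0), (3.0, 0.0), (0.0, 0.0)]

stress_exact = raw_stress(dissim, exact)
stress_collapsed = raw_stress(dissim, collapsed)

# Hypothetical cluster labels before and after embedding.
mismatches = membership_discrepancy([0, 0, 1], [0, 1, 1])

print(stress_exact, stress_collapsed, mismatches)
```

In this sketch the exact embedding has zero Stress and the collapsed one does not, so a method that weighs both criteria would trade a small increase in the first quantity for a reduction in the second.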
