XGvis: Interactive Data Visualization with Multidimensional Scaling

We discuss interactive techniques for multidimensional scaling (MDS) and a system, named "XGvis", that implements these techniques. MDS is a method for visualizing proximity data, that is, data where objects are characterized by dissimilarity values for all pairs of objects. MDS constructs maps of these objects in $\mathbb{R}^k$ by interpreting the dissimilarities as distances. MDS in its conventional batch implementations is prone to uncertainties with regard to 1) local minima in the underlying optimization, 2) sensitivity to the choice of the optimization criterion, 3) artifacts in point configurations, and 4) local inadequacy of the point configurations. These uncertainties will be addressed by the following interactive techniques: 1) algorithm animation, random restarts, and manual editing of configurations, 2) interactive control over parameters that determine the criterion and its minimization, 3) diagnostics for pinning down artifactual point configurations, and 4) restricting MDS to subsets of objects and subsets of pairs of objects. MDS was originally developed for the social sciences, but it is now also used for laying out graphs. Graph layout is usually done in 2D, but we allow layouts in arbitrary dimensions. We permit missing values, which can be used to implement multidimensional unfolding. We show applications to the mapping of computer usage data, to the dimension reduction of marketing segmentation data, to the layout of mathematical graphs and social network graphs, and finally to the reconstruction of molecules in nanotechnology. XGvis uses the XGobi system for visualizing point configurations. The XGvis system, which implements these techniques, is freely available with the XGobi distribution from
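To make the ideas in the abstract concrete, the following is a minimal sketch of metric MDS, not the XGvis implementation. It assumes a raw (unnormalized) squared-error stress, plain fixed-step gradient descent, and dissimilarities stored in the upper triangle of a nested list, with `None` marking a missing value. The function names `stress`, `mds`, and `mds_restarts` and all parameter defaults are hypothetical choices made for illustration.

```python
import math
import random


def stress(X, D):
    """Raw stress: sum of squared (distance - dissimilarity) over known pairs."""
    total = 0.0
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            if D[i][j] is None:  # missing dissimilarity: the pair is skipped
                continue
            total += (math.dist(X[i], X[j]) - D[i][j]) ** 2
    return total


def mds(D, k=2, steps=500, lr=0.05, seed=0):
    """Fixed-step gradient descent on raw stress from a random start in R^k."""
    rng = random.Random(seed)
    n = len(D)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(n)]
    for _ in range(steps):
        grad = [[0.0] * k for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if D[i][j] is None:
                    continue
                d = math.dist(X[i], X[j])
                if d == 0.0:  # gradient is undefined for coincident points
                    continue
                coef = 2.0 * (d - D[i][j]) / d
                for a in range(k):
                    g = coef * (X[i][a] - X[j][a])
                    grad[i][a] += g
                    grad[j][a] -= g
        for i in range(n):
            for a in range(k):
                X[i][a] -= lr * grad[i][a]
    return X, stress(X, D)


def mds_restarts(D, k=2, restarts=5, **kwargs):
    """Run several random starts and keep the configuration with lowest stress."""
    runs = (mds(D, k=k, seed=s, **kwargs) for s in range(restarts))
    return min(runs, key=lambda run: run[1])


# Example: dissimilarities of a unit square; only entries with i < j are read.
r2 = math.sqrt(2.0)
square = [
    [0.0, 1.0, r2, 1.0],
    [0.0, 0.0, 1.0, r2],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]
config, final_stress = mds_restarts(square, k=2, restarts=3)
print("stress of best restart:", final_stress)
```

Random restarts here are simply different seeds; keeping the lowest-stress run is one simple way to guard against poor local minima. Setting entries to `None` restricts the fit to a subset of pairs, which is the mechanism the abstract mentions for missing values.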
