Multidimensional Scaling by Deterministic Annealing

Multidimensional scaling addresses the problem of how proximity data can be faithfully visualized as points in a low-dimensional Euclidean space. The quality of a data embedding is measured by a cost function called stress, which compares proximity values with the Euclidean distances between the corresponding points. We present a novel deterministic annealing algorithm that efficiently determines embedding coordinates for this continuous optimization problem. Experimental results demonstrate the superiority of this optimization technique over conventional gradient descent methods. Furthermore, we propose a transformation of dissimilarities to reduce the mismatch between a high-dimensional data space and a low-dimensional embedding space.
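To make the notion of stress concrete, the following is a minimal sketch of one common variant, the raw (unnormalized) stress: the sum of squared differences between embedded distances and given dissimilarities over all pairs. The abstract does not specify which stress variant or weighting the paper uses, so the function names `pairwise_distances` and `raw_stress`, and the choice of unweighted raw stress, are assumptions made purely for illustration.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X (shape: n points x d dims)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def raw_stress(X, D):
    """Raw stress: sum over pairs i < j of (||x_i - x_j|| - D_ij)^2.

    X is an (n, d) array of embedding coordinates and D is a symmetric
    (n, n) array of dissimilarities. This unweighted form is an assumed
    stand-in, not necessarily the cost function used in the paper.
    """
    dist = pairwise_distances(X)
    iu = np.triu_indices(len(X), k=1)  # each unordered pair counted once
    return float(((dist[iu] - D[iu]) ** 2).sum())


# Two points at distance 5 whose target dissimilarity is 4.
X = np.array([[0.0, 0.0], [3.0, 4.0]])
D = np.array([[0.0, 4.0], [4.0, 0.0]])
stress_value = raw_stress(X, D)
```

An embedding that reproduces the dissimilarities exactly has zero stress under this definition; any optimizer, whether gradient descent or an annealing scheme, would adjust `X` to reduce this value.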
