A simple and efficient Bayesian procedure for selecting dimensionality in multidimensional scaling

Multidimensional scaling (MDS) is a technique that retrieves the locations of objects in a Euclidean space (the object configuration) from data consisting of the dissimilarities between pairs of objects. An important issue in MDS is choosing an appropriate dimensionality for the space underlying these dissimilarities. In this paper, we propose a simple and efficient Bayesian approach for selecting dimensionality in MDS. For each column (attribute) vector of an MDS configuration, we assume a prior that is a mixture of a point mass at 0 and a continuous distribution over the rest of the parameter space. The marginal posterior distribution of each column vector is then a mixture of the same form, in which the mixing weight of the continuous distribution measures the significance of that column vector. We propose an efficient Markov chain Monte Carlo (MCMC) method for estimating the mixture posterior distribution. The proposed method is fully Bayesian: it takes parameter estimation error into account when computing penalties for complex models and provides an uncertainty measure for the choice of dimensionality. The MCMC algorithm is also computationally efficient because it visits models of different dimensionality within a single MCMC run. A simulation study compares the proposed method with the Bayesian method of Oh and Raftery (2001). Three real data sets are analysed using the proposed method.
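
The mixture prior and posterior described in the abstract can be written schematically as follows. This is only a sketch in assumed notation: the symbols x_j, w_j, g_j, D and T do not appear in the abstract, and the continuous component g_j is left unspecified because the abstract does not name it.

```latex
% Assumed notation (not from the source):
%   X = (x_1, ..., x_p)  n x p object configuration, x_j its j-th column
%   D                    observed dissimilarities
%   \delta_0             point mass at the zero vector
%   w_j, g_j             prior weight and density of the continuous component
\begin{align*}
  \pi(x_j) &= (1 - w_j)\,\delta_0(x_j) + w_j\, g_j(x_j),
    \qquad j = 1, \dots, p, \\
  \pi(x_j \mid D) &= (1 - \tilde{w}_j)\,\delta_0(x_j)
    + \tilde{w}_j\, \tilde{g}_j(x_j \mid D),
    \qquad \tilde{w}_j = P(x_j \neq 0 \mid D), \\
  \tilde{w}_j &\approx \frac{1}{T} \sum_{t=1}^{T}
    \mathbf{1}\{ x_j^{(t)} \neq 0 \}.
\end{align*}
```

The last line is the usual Monte Carlo estimate of the posterior weight from T MCMC draws x_j^{(t)}; it is shown as a generic illustration and is not necessarily the estimator used in the paper. Under this reading, a column with posterior weight close to 1 is retained as a dimension, and one with weight close to 0 is dropped.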

[1] W. K. Hastings, et al. Monte Carlo Sampling Methods Using Markov Chains and Their Applications, 1970.

[2] J. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, 1964.

[3] J. Graef, et al. The Determination of the Underlying Dimensionality of an Empirically Obtained Matrix of Proximities, 1974, Multivariate Behavioral Research.

[4] A. Raftery, et al. Bayesian Multidimensional Scaling and Choice of Dimension, 2001.

[5] D. E. Welchew, et al. Multidimensional Scaling of Integrated Neurocognitive Function and Schizophrenia as a Disconnexion Disorder, 2002, NeuroImage.

[6] P. Groenen, et al. Modern multidimensional scaling, 1996.

[7] Barbara P. Buttenfield, et al. Loglinear and multidimensional scaling models of digital library navigation, 2002.

[8] Pierre Courrieu, et al. Straight monotonic embedding of data sets in Euclidean spaces, 2002, Neural Networks.

[9] John A. Hartigan, et al. Clustering Algorithms, 1975.

[10] Kensuke Okada, et al. BMDS: A Collection of R Functions for Bayesian Multidimensional Scaling, 2009.

[11] W. Torgerson. Multidimensional scaling: I. Theory and method, 1952.

[12] M. Savage, et al. Ascription into Achievement: Models of Career Systems at Lloyds Bank, 1890-1970, 1996, American Journal of Sociology.

[13] Richard L. Priem, et al. Executives’ Perceptions of Uncertainty Sources: A Numerical Taxonomy and Underlying Dimensions, 2002.

[14] Brita Elvevåg, et al. Scaling and clustering in the study of semantic disruptions in patients with schizophrenia: a re-evaluation, 2003, Schizophrenia Research.

[15] Shijin Ren, et al. Use of multidimensional scaling in the selection of wastewater toxicity test battery components, 2003, Water Research.

[16] Forrest W. Young. Multidimensional Scaling: History, Theory, and Applications, 1987.

[17] Joseph L. Zinnes, et al. Theory and Methods of Scaling, 1958.

[18] S. Raghavan, et al. A visualization model based on adjacency data, 2002, Decision Support Systems.

[19] Raphael Gottardo, et al. Markov Chain Monte Carlo With Mixtures of Mutually Singular Distributions, 2008.