HORSESHOES IN MULTIDIMENSIONAL SCALING AND LOCAL KERNEL METHODS

Classical multidimensional scaling (MDS) is a method for visualizing high-dimensional point clouds by mapping them to low-dimensional Euclidean space. This mapping is defined in terms of eigenfunctions of a matrix of interpoint dissimilarities. In this paper we analyze in detail multidimensional scaling applied to a specific dataset: the 2005 United States House of Representatives roll call votes. Certain MDS and kernel projections output “horseshoes” that are characteristic of dimensionality reduction techniques. We show that, in general, a latent ordering of the data gives rise to these patterns when one has only local information, that is, when only the interpoint distances for nearby points are known accurately. Our results provide rigorous statements about, and insight into, manifold learning in the special case where the manifold is a curve.

1. Introduction. Classical multidimensional scaling is a widely used technique for dimensionality reduction in complex data sets, a central problem in pattern recognition and machine learning. In this paper we carefully analyze the output of MDS applied to the 2005 United States House of Representatives roll call votes [Office of the Clerk, U.S. House of Representatives (2005)]. The results we find seem stable over recent years. The resulting 3-dimensional mapping of legislators shows “horseshoes” that are characteristic of a number of dimensionality reduction techniques, including principal components analysis and correspondence analysis. These patterns are heuristically attributed to a latent ordering of the data, for example, the ranking of politicians within a left-right spectrum. Our work lends insight into this heuristic, and we present a rigorous analysis of the “horseshoe phenomenon.”
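As a rough illustration of the setting described above, the following sketch runs classical MDS on synthetic points that have a latent one-dimensional ordering and a dissimilarity that is informative only for nearby points. It is not the authors' analysis and does not use the roll call data. The function name `classical_mds`, the sample size `n`, the bandwidth `scale`, and the saturating dissimilarity `1 - exp(-|x_i - x_j| / scale)` are all assumptions chosen for illustration.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed n points in R^k from an n x n dissimilarity matrix D (classical MDS)."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * H @ (D ** 2) @ H                # doubly centered squared dissimilarities
    vals, vecs = np.linalg.eigh(B)             # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]           # indices of the k largest eigenvalues
    lam = np.clip(vals[idx], 0.0, None)        # guard against small negative values
    return vecs[:, idx] * np.sqrt(lam)         # coordinates: eigenvectors scaled by sqrt(eigenvalue)

# Hypothetical data: a latent ordering given by positions on a line.
n = 200
x = np.linspace(0.0, 1.0, n)

# Assumed "local" dissimilarity: close to |x_i - x_j| / scale for nearby points,
# but saturating near 1 for distant ones, so only nearby distances are informative.
scale = 0.1
D = 1.0 - np.exp(-np.abs(x[:, None] - x[None, :]) / scale)

Y = classical_mds(D, k=2)                      # n x 2 embedding
```

Plotting `Y[:, 0]` against `Y[:, 1]` is one way to check whether the embedded points bend into an arch instead of lying along a straight line. Changing `scale` varies how local the dissimilarity is.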
