Cluster Optimized Proximity Scaling

Abstract: Proximity scaling methods such as multidimensional scaling represent objects in a low-dimensional configuration so that fitted object distances optimally approximate object proximities. Besides finding the optimal configuration, an additional goal may be to make statements about the cluster arrangement of objects. This fails if the configuration lacks appreciable clusteredness. We present cluster optimized proximity scaling (COPS), which attempts to find a configuration that exhibits clusteredness. In COPS, a flexible parameterized scaling loss function that may emphasize differentiation information in the proximities is augmented with an index (the OPTICS Cordillera) that penalizes lack of clusteredness of the configuration. We present two variants of COPS, one for finding a configuration directly and one for hyperparameter selection for parametric stresses. We apply both to a functional magnetic resonance imaging dataset on neural representations of mental states in a social cognition task and show that COPS improves clusteredness of the configuration, enabling visual identification of clusters of mental states. Online supplementary materials are available, including an R package and a document with additional details.
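
To make the idea of a scaling loss augmented with a clusteredness index concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a power-transformed least-squares stress and combines it with a clusteredness term by a weighted difference. The names `power_stress`, `toy_clusteredness`, `cops_objective`, the weights `v1` and `v2`, and the exponents `kappa` and `lam` are all hypothetical choices for illustration. In particular, `toy_clusteredness` is a crude stand-in and is not the OPTICS Cordillera.

```python
import math


def pairwise_distances(X):
    """Euclidean distance matrix for a list of coordinate tuples."""
    n = len(X)
    return [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]


def power_stress(delta, X, kappa=1.0, lam=1.0):
    """Normalized squared error between delta**lam and fitted distances**kappa.

    kappa and lam are assumed hyperparameters of a parametric stress.
    """
    D = pairwise_distances(X)
    n = len(X)
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            target = delta[i][j] ** lam
            num += (target - D[i][j] ** kappa) ** 2
            den += target ** 2
    return num / den


def toy_clusteredness(X):
    """Hypothetical stand-in index in [0, 1]; NOT the OPTICS Cordillera.

    Computes 1 - (mean nearest-neighbour distance / mean pairwise distance),
    which grows when points sit close to a neighbour relative to the overall spread.
    """
    D = pairwise_distances(X)
    n = len(X)
    nn = sum(min(D[i][j] for j in range(n) if j != i) for i in range(n)) / n
    n_pairs = n * (n - 1) / 2
    mean_all = sum(D[i][j] for i in range(n) for j in range(i + 1, n)) / n_pairs
    return 1.0 - nn / mean_all if mean_all > 0 else 0.0


def cops_objective(delta, X, v1=1.0, v2=0.5, kappa=1.0, lam=1.0):
    """Assumed combination: badness of fit minus weighted clusteredness."""
    return v1 * power_stress(delta, X, kappa, lam) - v2 * toy_clusteredness(X)


# Two illustrative one-dimensional layouts embedded in the plane.
X_even = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
X_clustered = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0)]

# Proximities taken as the distances of the evenly spaced layout.
delta = pairwise_distances(X_even)

print(power_stress(delta, X_even))
print(toy_clusteredness(X_even), toy_clusteredness(X_clustered))
print(cops_objective(delta, X_even), cops_objective(delta, X_clustered))
```

Under these assumptions, minimizing `cops_objective` over `X` would correspond loosely to finding a configuration directly, while minimizing it over `kappa` and `lam` (refitting `X` for each candidate) would correspond loosely to hyperparameter selection for a parametric stress. The weights `v1` and `v2` govern the trade-off between fit and clusteredness.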
