Matrix comparison, Part 2: Measuring the resemblance between proximity measures or ordination results by use of the Mantel and Procrustes statistics

The present two-part article introduces matrix comparison as a formal means of evaluation in informetric studies such as cocitation analysis. The first part introduces and discusses the motivation for bringing matrix comparison to informetric studies, together with two important issues that influence such comparisons: matrix generation and the composition of proximity measures. In this second part, the authors introduce and thoroughly demonstrate two related matrix comparison techniques: the Mantel test and Procrustes analysis. These techniques can compare and evaluate the degree of monotonicity between different proximity measures or their ordination results. Common to both techniques is the application of permutation procedures to test hypotheses about matrix resemblances. The choice of technique depends on the validation at hand. In the case of the Mantel test, the degree of resemblance between two measures forecasts their potentially different effect on ordination and clustering results. In principle, two proximity measures with a very strong resemblance will most likely produce near-identical results, so the choice between the two becomes less important. Alternatively, or as a supplement, Procrustes analysis compares the actual ordination results without investigating the underlying proximity measures, by matching two configurations of the same objects in a multidimensional space. An advantage of Procrustes analysis, though, is the graphical solution provided by the superimposition plot and the resulting decomposition of variance components. Accordingly, Procrustes analysis provides not only a measure of general fit between configurations, but also values for individual objects, enabling more elaborate validations. As such, the Mantel test and Procrustes analysis can be used as statistical validation tools in informetric studies and thus help in choosing suitable proximity measures. © 2007 Wiley Periodicals, Inc.
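
To make the permutation procedure behind the Mantel test concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes two symmetric proximity matrices over the same objects, uses the Pearson correlation of the off-diagonal entries as the matrix statistic (a rank correlation could be substituted when only monotonicity is of interest), and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function names, the default of 999 permutations, and the toy data are illustrative assumptions.

```python
import math
import random


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(a, b, permutations=999, seed=0):
    """Return (matrix correlation, one-sided permutation p-value).

    Objects of matrix `a` are relabelled at random, so rows and columns
    are permuted together; matrix `b` is held fixed.
    """
    n = len(a)
    rng = random.Random(seed)
    flat_b = upper_triangle(b)
    observed = pearson(upper_triangle(a), flat_b)
    hits = 1  # the observed arrangement counts as one permutation
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [a[order[i]][order[j]]
                    for i in range(n) for j in range(i + 1, n)]
        if pearson(permuted, flat_b) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy data (hypothetical): six objects placed on a line, and a second
# proximity measure that is a linear rescaling of the first.
positions = [0, 1, 3, 6, 10, 15]
dist_a = [[abs(p - q) for q in positions] for p in positions]
dist_b = [[2 * d + 1 if i != j else 0 for j, d in enumerate(row)]
          for i, row in enumerate(dist_a)]

r, p = mantel_test(dist_a, dist_b)
print(f"matrix correlation = {r:.3f}, permutation p-value = {p:.3f}")
```

A correlation near 1 with a small p-value would suggest that the two measures order the object pairs in much the same way, which is the situation in which the choice between them matters least for subsequent ordination or clustering.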
