Geometric representation of association between categories

Categories can be counted, rated, or ranked, but they cannot be measured. Likewise, persons or individuals can be counted, rated, or ranked, but they cannot be measured either. Nevertheless, psychology realized early on that it could take an indirect road to measurement: what can be measured is the strength of association between categories in samples or populations, and what can be quantitatively compared are counts, ratings, or rankings made under different circumstances or originating from different persons. The strong demand for quantitative analysis of categorical data has thus produced a variety of statistical methods, with substantial contributions from psychometrics and sociometrics. What is the common basis of these methods for dealing with categories? The basic element they share is that the sample space has a special geometry, in which categories (or persons) are point masses forming a simplex, while distributions of counts or profiles of ratings are centers of gravity, which are themselves point masses. Rankings form a discrete subset in the interior of the simplex, known as the permutation polytope, and paired comparisons form another subset on the edges of the simplex. Distances between point masses are the basic tool of analysis. The paper gives some history of the major concepts, which leads naturally to a new concept: the shadow point. It is then shown how loglinear models, Luce and Rasch models, unfolding models, correspondence analysis and homogeneity analysis, forced classification and classification trees, as well as other models and methods, fit into this geometrical framework.
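The geometric idea can be made concrete with a small sketch. It is an illustration only, not the paper's own procedure: the counts are hypothetical, the function names are invented for this example, and the chi-square distance is borrowed from correspondence analysis as one plausible choice of distance between point masses. Categories sit at the unit vertices of a simplex, a distribution of counts becomes the center of gravity of those vertices, and two such centers can then be compared by their distance.

```python
import math


def center_of_gravity(counts):
    """Profile of a count vector: the mean of the simplex vertices,
    each vertex (category) weighted by its count.

    With category j placed at the unit vector e_j, this weighted mean
    has coordinates counts[j] / total, i.e. the relative frequencies.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive total")
    return [c / total for c in counts]


def weighted_mean(points, masses):
    """Center of gravity of several point masses in the simplex."""
    total = sum(masses)
    dim = len(points[0])
    return [sum(m * p[j] for p, m in zip(points, masses)) / total
            for j in range(dim)]


def euclidean_distance(p, q):
    """Ordinary Euclidean distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def chi_square_distance(p, q, average_profile):
    """Euclidean distance with each coordinate weighted by the inverse
    of the average profile (assumed here, as in correspondence analysis)."""
    return math.sqrt(sum((a - b) ** 2 / m
                         for a, b, m in zip(p, q, average_profile)))


# Hypothetical counts of two groups over three categories.
group_a = [30, 50, 20]
group_b = [10, 40, 50]

profile_a = center_of_gravity(group_a)
profile_b = center_of_gravity(group_b)

# The pooled profile is itself a center of gravity: the mean of the
# two group profiles, weighted by group size.
pooled = weighted_mean([profile_a, profile_b], [sum(group_a), sum(group_b)])

print("profile A:", profile_a)
print("profile B:", profile_b)
print("pooled   :", pooled)
print("Euclidean distance :", euclidean_distance(profile_a, profile_b))
print("chi-square distance:", chi_square_distance(profile_a, profile_b, pooled))
```

Each profile sums to one and therefore lies inside the simplex spanned by the three categories. The pooled profile coincides with the profile of the summed counts, which is the sense in which centers of gravity are again point masses.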
