Correspondence analysis and the Freeman-Tukey statistic: A study of archaeological data

Abstract Traditionally, simple correspondence analysis is performed by applying a singular value decomposition to a matrix of standardised residuals, where the sum of squares of these residuals gives Pearson’s chi-squared statistic. Such residuals, which are treated as asymptotically normally distributed, arise by assuming that the cell frequencies are Poisson random variables, so that their mean and variance are equal. However, past studies have shown that this assumption often fails and that cell frequencies are prone to overdispersion. A growing number of remedies have been proposed in the statistics and allied literature. One such remedy, and the focus of this paper, is to stabilise the variance using the Freeman–Tukey transformation. The properties of correspondence analysis are therefore examined when it is performed by decomposing the Freeman–Tukey residuals of a two-way contingency table. The strategy is applied to a large, sparse set of archaeological data.
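The contrast drawn in the abstract can be illustrated with a minimal sketch. It assumes the widely used form of the Freeman–Tukey statistic, T² = 4 Σ (√n_ij − √e_ij)², with e_ij the expected count under independence, and writes the corresponding residual on the proportion scale as 2(√p_ij − √(p_i· p_·j)). This form may differ in detail from the definition used in the paper. The function names and the toy table are hypothetical and are not taken from the archaeological data.

```python
import numpy as np

def pearson_residuals(N):
    """Standardised (Pearson) residuals on the proportion scale."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                            # correspondence matrix
    E = np.outer(P.sum(axis=1), P.sum(axis=0))  # independence: p_i. * p_.j
    return (P - E) / np.sqrt(E)

def freeman_tukey_residuals(N):
    """Assumed Freeman-Tukey residuals: 2 * (sqrt(p_ij) - sqrt(p_i. * p_.j))."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()
    E = np.outer(P.sum(axis=1), P.sum(axis=0))
    return 2.0 * (np.sqrt(P) - np.sqrt(E))

# Made-up sparse toy table (rows and columns all have non-zero totals).
N = np.array([[10, 0, 3],
              [2, 8, 0],
              [0, 1, 12],
              [5, 4, 6]])
n = N.sum()

Z = pearson_residuals(N)
F = freeman_tukey_residuals(N)

# n times the sum of squared residuals gives each statistic.
chi2 = n * (Z ** 2).sum()   # Pearson's chi-squared statistic
T2 = n * (F ** 2).sum()     # Freeman-Tukey statistic (assumed form)

# Singular values of each residual matrix; their squares partition
# the corresponding sum of squared residuals (the total inertia).
sv_pearson = np.linalg.svd(Z, compute_uv=False)
sv_ft = np.linalg.svd(F, compute_uv=False)

print("chi2 =", chi2, " T2 =", T2)
print("Pearson singular values:", sv_pearson)
print("Freeman-Tukey singular values:", sv_ft)
```

In both cases the squared singular values sum to the total of the squared residuals, so each decomposition partitions its own statistic (divided by n) across the dimensions of the display. Zero cells pose no problem for the square-root form, which is one reason it is attractive for sparse tables.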
