Correspondence analysis is a useful tool to uncover the relationships among categorical variables.

OBJECTIVE Correspondence analysis (CA) is a multivariate graphical technique designed to explore the relationships among categorical variables. Epidemiologists frequently collect data on multiple categorical variables with the goal of examining associations among these variables. Nevertheless, CA appears to be an underused technique in epidemiology. The objective of this article is to present the utility of CA in an epidemiological context.

STUDY DESIGN AND SETTING The theory and interpretation of CA in the case of two and more than two variables are illustrated through two examples.

RESULTS The outcome from CA is a graphical display of the rows and columns of a contingency table that is designed to permit visualization of the salient relationships among the variable responses in a low-dimensional space. Such a representation reveals a more global picture of the relationships among row-column pairs, which would otherwise not be detected through a pairwise analysis.

CONCLUSION When the study variables of interest are categorical, CA is an appropriate technique to explore the relationships among variable response categories and can play a complementary role in analyzing epidemiological data.
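
To make the idea of placing the rows and columns of a contingency table in a low-dimensional space concrete, the following is a minimal sketch of simple (two-variable) CA in one common formulation: a singular value decomposition of the standardized residuals of the table. It is an illustration only, not the computation used in the article. The function name `correspondence_analysis` and the 3 x 3 table are hypothetical, and the sketch assumes every row and column total is greater than zero.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple CA of a two-way contingency table (illustrative sketch).

    Assumes all row and column totals are strictly positive.
    Returns row principal coordinates, column principal coordinates,
    and the principal inertias (squared singular values).
    """
    counts = np.asarray(table, dtype=float)
    P = counts / counts.sum()          # correspondence matrix (proportions)
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)          # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates: singular vectors scaled by the singular values
    # and by the inverse square roots of the masses.
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2

# Hypothetical counts, invented purely for illustration.
example_table = np.array([[20, 10, 5],
                          [10, 15, 10],
                          [5, 10, 25]])
rows, cols, inertia = correspondence_analysis(example_table)
# Plotting the first two columns of `rows` and `cols` gives the usual CA map.
print(rows[:, :2])
print(cols[:, :2])
print(inertia / inertia.sum())  # share of total inertia per dimension
```

In this formulation the principal inertias sum to the Pearson chi-square statistic of the table divided by the sample size, so the share of inertia carried by the first two dimensions indicates how faithfully a two-dimensional map represents the departure from independence.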
