Thirteen ways to look at the correlation coefficient

Abstract: In 1885, Sir Francis Galton first defined the term “regression” and completed the theory of bivariate correlation. A decade later, Karl Pearson developed the index that we still use to measure correlation, Pearson's r. Our article is written in recognition of the 100th anniversary of Galton's first discussion of regression and correlation. We begin with a brief history. Then we present 13 different formulas, each of which represents a different computational and conceptual definition of r. Each formula suggests a different way of thinking about this index, from algebraic, geometric, and trigonometric settings. We show that Pearson's r (or simple functions of r) may variously be thought of as a special type of mean, a special type of variance, the ratio of two means, the ratio of two variances, the slope of a line, the cosine of an angle, and the tangent to an ellipse, and may be looked at from several other interesting perspectives.
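Three of the interpretations listed in the abstract can be checked numerically. The sketch below is an illustration only, not the article's own notation or code: the function names and the small data set are hypothetical, and population standard deviations (dividing by n) are assumed for the standardized scores. It computes r as the mean of products of standardized scores, as the cosine of the angle between the two mean-centered variable vectors, and as the least-squares slope when both variables are standardized.

```python
import math


def _center(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _standardize(values):
    """Return z-scores using the population SD (divide by n); an assumption."""
    centered = _center(values)
    sd = math.sqrt(sum(c * c for c in centered) / len(values))
    return [c / sd for c in centered]


def r_as_mean_of_z_products(x, y):
    """r as a special type of mean: the average product of paired z-scores."""
    zx, zy = _standardize(x), _standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def r_as_cosine(x, y):
    """r as the cosine of the angle between the centered variable vectors."""
    cx, cy = _center(x), _center(y)
    dot = sum(a * b for a, b in zip(cx, cy))
    norm_x = math.sqrt(sum(a * a for a in cx))
    norm_y = math.sqrt(sum(b * b for b in cy))
    return dot / (norm_x * norm_y)


def r_as_standardized_slope(x, y):
    """r as the slope of the least-squares line of z_y regressed on z_x."""
    zx, zy = _standardize(x), _standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_mean = r_as_mean_of_z_products(x, y)
r_cos = r_as_cosine(x, y)
r_slope = r_as_standardized_slope(x, y)
print(r_mean, r_cos, r_slope)
```

The three values agree up to floating-point rounding, which is the sense in which each formula is a different way of looking at the same index.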
