A Second Category of Limitations in the Applicability of the Contingency Coefficient

Pearson, in his fundamental paper on the theory of contingency, indicated clearly the difficulties of comparing coefficients of association, correlation and contingency. He then wrote (p. 22): "The degree of approach of both C1 and C2 to the correlation must be studied for each special class of cases, and only when this has been done will their use be really legitimate and effective." This was written twenty-five years ago, but notwithstanding various subsequent contributions by Pearson himself, much remains to be done in the study of the special classes of cases to which it may seem desirable to apply contingency methods. This is unfortunate, because the contingency coefficient is the most general measure of the relationship between two variables. It is independent of the nature of the variables (whether quantitatively measurable or only describable in categories), of the form of the frequency distribution, of the order of arrangement of the classes, and of the nature of the regression curve. An earlier paper indicated the inapplicability of the contingency method in its usual form to cases in which x is the limiting value of y.² The purpose of the present paper is to call attention to another category of cases in which the application of the contingency method may lead to coefficients differing widely from the correlation coefficient and the correlation ratio. The present category has in common with that already discussed the condition that certain cells of the contingency surface (which shows the distribution of frequencies of individuals with the attributes x and y, each divided into several classes) are void of actual frequencies for physical reasons, but are necessarily assigned theoretical frequencies in the conventional method of calculating the independent probabilities of occurrence of combinations of x and y.
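The point about void cells receiving theoretical frequencies can be illustrated numerically. The following is a minimal sketch in Python, not a calculation taken from the paper: the 3 x 3 table is hypothetical, with the cells below the diagonal imagined to be physically impossible, and the function names are chosen here for illustration only. It computes the conventional independence frequencies (row total times column total, divided by N) and the mean-square contingency coefficient C = sqrt(chi2 / (N + chi2)).

```python
import math


def expected_frequencies(table):
    """Independence frequencies: (row total * column total) / N for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def contingency_coefficient(table):
    """Mean-square contingency coefficient, C = sqrt(chi2 / (N + chi2))."""
    expected = expected_frequencies(table)
    n = sum(sum(row) for row in table)
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
        if exp > 0  # skip cells whose row or column is entirely empty
    )
    return math.sqrt(chi2 / (n + chi2))


# Hypothetical data (an assumption for illustration, not from the paper):
# the zero cells below the diagonal are taken to be void for physical reasons.
table = [
    [10, 10, 10],
    [0, 10, 10],
    [0, 0, 10],
]

expected = expected_frequencies(table)
c = contingency_coefficient(table)

for row in expected:
    print([round(value, 2) for value in row])
print(round(c, 4))
```

In this sketch every void cell is assigned a positive theoretical frequency, so each contributes to chi-square even though no individual could ever fall in it. A coefficient obtained this way therefore reflects the physical restriction as well as whatever relationship exists among the cells that can actually be occupied.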
In the case already considered the void cells are distributed in the form of a triangle occupying the region of the surface above or below the diagonal row of cells, and the limitation in the applicability of the contingency method results from the fact that y is necessarily equal to