Approximating the Tetrachoric Correlation Coefficient

Samples (x, y), taken from a bivariate normal distribution with correlation ρ, can be allocated to one of the cells of a 2 × 2 contingency table according to whether x ≤ x0 or x > x0 and whether y ≤ y0 or y > y0, where x0 and y0 are cut-off values. Such a contingency table is shown in Table 1, where a is the number of samples for which x > x0 and y > y0, and Si is the proportion of samples for which x > x0. The tetrachoric correlation coefficient, r, is obtained from a 2 × 2 contingency table and provides an estimate of the underlying correlation ρ. Everitt (1910) tabulated the parameters of a kth-order polynomial in r for k ≤ 6 and gave details of the parameters for 7 ≤ k ≤ 24; to obviate this, additional tables for use when r ≥ 0.8 were given by Everitt (1912). These tables are also given in the compilation by Pearson (1914), where they cover 16 pages. Clearly, this method of calculating r is cumbersome. Brown (1977) has given an algorithm for finding the tetrachoric correlation, so the original laborious method can be replaced by a computer program.
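The allocation of samples to a 2 × 2 table, and the recovery of r from that table, can be illustrated with a short Python sketch. It is neither the polynomial method tabulated by Everitt nor the algorithm of Brown (1977). Instead it solves directly, by bisection, for the correlation at which the bivariate normal probability of the cell x > x0, y > y0 equals the observed proportion a/n, using the standard identity P(X > h, Y > k) = Φ(-h)Φ(-k) + ∫ φ2(h, k; t) dt, with the integral taken over t from 0 to ρ. The cell labels b, c and d, the function names, and the numerical settings (Simpson's rule with 200 steps, search limits of ±0.999) are assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides cdf and inv_cdf


def dichotomize(samples, x0, y0):
    """Allocate (x, y) samples to the cells of a 2 x 2 table.

    Returns (a, b, c, d), where a counts x > x0 and y > y0 (as in the text).
    The labels b, c, d are an assumed convention:
    b: x > x0, y <= y0;  c: x <= x0, y > y0;  d: x <= x0, y <= y0.
    """
    a = b = c = d = 0
    for x, y in samples:
        if x > x0 and y > y0:
            a += 1
        elif x > x0:
            b += 1
        elif y > y0:
            c += 1
        else:
            d += 1
    return a, b, c, d


def upper_orthant(h, k, rho, steps=200):
    """P(X > h, Y > k) for a standard bivariate normal with correlation rho.

    Adds to the independence value the integral of the bivariate normal
    density over correlations from 0 to rho (Simpson's rule, even steps).
    """
    def density(t):
        s = 1.0 - t * t
        quad = h * h - 2.0 * t * h * k + k * k
        return math.exp(-quad / (2.0 * s)) / (2.0 * math.pi * math.sqrt(s))

    width = rho / steps
    total = density(0.0) + density(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * width)
    return _STD.cdf(-h) * _STD.cdf(-k) + total * width / 3.0


def tetrachoric(a, b, c, d, tol=1e-6):
    """Estimate r from a 2 x 2 table by bisection on the correlation.

    Requires both margins to be non-degenerate (proportions strictly
    between 0 and 1), otherwise the thresholds are undefined.
    """
    n = a + b + c + d
    h = _STD.inv_cdf(1.0 - (a + b) / n)  # standardized cut-off for x
    k = _STD.inv_cdf(1.0 - (a + c) / n)  # standardized cut-off for y
    target = a / n
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The orthant probability increases with the correlation.
        if upper_orthant(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    rng = random.Random(1)
    rho = 0.6  # underlying correlation used to simulate the samples
    samples = []
    for _ in range(20000):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        samples.append((x, y))
    table = dichotomize(samples, x0=0.3, y0=-0.2)
    print(table, tetrachoric(*table))
```

As a check on the sketch, when both cut-offs are at the medians the orthant probability has the closed form 1/4 + arcsin(ρ)/(2π), so a table with a = d = 1000 and b = c = 500 (a/n = 1/3) corresponds to r = 0.5.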