A comparison of tests for Hardy-Weinberg equilibrium.

Tests for the goodness of fit of a population to Hardy-Weinberg equilibrium are compared with respect to their power and the accuracy of their distributional approximations. The tests are similar in power. However, when one or more alleles are rare (frequencies below .45 to .21 for n = 10 and n = 200, respectively), some of the tests are unable to detect outbreeding. When all but one allele are rare (frequencies below .40 to .15 for n = 10 and n = 200, respectively), none of the tests is able to detect outbreeding. With respect to distributional assumptions, two situations arise: (i) Test of hypothesis (a significance level is preset and the null hypothesis is accepted or rejected at that level); here the chi-square test with conditional expectations, the Freeman-Tukey test and the Mantel-Li test, all without continuity correction, were found to be best; they were closely followed by the chi-square test and the conditional chi-square test, both with a continuity correction of 1/4, and by the Elston-Forthofer average test. (ii) Test of significance (the attained significance level is used as a measure of the strength of evidence against the null hypothesis); here the chi-square test and the conditional chi-square test, both with a continuity correction of 1/4, and the average test were found to be best.
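
To make the role of the continuity correction concrete, the following is a minimal sketch of the ordinary chi-square goodness-of-fit test for the two-allele case. It is an illustration under stated assumptions, not the procedure used in the comparison: the function name `hwe_chi_square` and its parameters are hypothetical, the correction is assumed to enter as (|O - E| - c)^2 / E, and the p-value is taken from the asymptotic chi-square distribution with 1 degree of freedom.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb, correction=0.0):
    """Chi-square test of Hardy-Weinberg proportions for two alleles.

    Hypothetical helper for illustration only. `correction` is the
    continuity correction c, assumed here to enter each term as
    (|O - E| - c)^2 / E, truncated at zero.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("sample is empty")

    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("only one allele observed; the test is undefined")

    observed = (n_aa, n_ab, n_bb)
    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    stat = sum(
        max(abs(o - e) - correction, 0.0) ** 2 / e
        for o, e in zip(observed, expected)
    )

    # Upper tail of a chi-square variate with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Usage: a sample with a heterozygote deficit, with and without c = 1/4.
plain_stat, plain_p = hwe_chi_square(30, 40, 30)
corrected_stat, corrected_p = hwe_chi_square(30, 40, 30, correction=0.25)
```

In the terms used above, a test of hypothesis would compare `plain_p` or `corrected_p` with a preset level, whereas a test of significance would report the p-value itself as the strength of evidence against equilibrium.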