Assessing screening tests: extensions of McNemar's test.

We address the problem of comparing a new screening test to a currently available screening test in the absence of a gold standard. When both tests are given to each participant in a clinical trial, the usual analytical approach is to apply McNemar's test for equality of the off-diagonal probabilities, with rejection of the null hypothesis implying that the tests differ. For assessing equivalence, however, we consider a compound null hypothesis that the new test gives either fewer or more positive results than the standard. If both parts of this hypothesis are rejected, we assert equivalence in the rate of positive responses. We propose an extension of McNemar's test for this situation. A companion step is to construct a confidence interval for the ratio of the marginal probabilities and assert equivalence if the interval is sufficiently narrow. It is also important that the two tests agree a large proportion of the time; this agreement can be verified with a complementary two-tailed binomial test. Another situation arises when there is a gold standard for disease diagnosis and we wish to compare the sensitivity and specificity of two screening tests. We show that a 2-degrees-of-freedom chi-square test based on two McNemar-like tables is appropriate in this case.
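As a rough illustration of the quantities mentioned above, the following Python sketch computes the standard McNemar chi-square from the two discordant counts of a paired 2x2 table, and then one plausible reading of the 2-degrees-of-freedom test: the sum of two such statistics, one from participants with disease (comparing sensitivities) and one from participants without disease (comparing specificities). The function names, the symbols b and c for the discordant counts, the omission of a continuity correction, and the summation itself are assumptions made for this example; they are not taken from the paper, whose exact statistic may differ. The equivalence extension is not sketched here because the abstract does not specify its form.

```python
import math


def mcnemar_chi2(b, c):
    """McNemar chi-square (no continuity correction).

    b: pairs positive on test A only; c: pairs positive on test B only.
    (Illustrative notation, not the paper's.)
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; statistic is undefined")
    return (b - c) ** 2 / (b + c)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


def mcnemar_test(b, c):
    """Return (statistic, p-value) for the usual 1-df McNemar test."""
    stat = mcnemar_chi2(b, c)
    return stat, chi2_sf_1df(stat)


def combined_two_table_test(b_dis, c_dis, b_non, c_non):
    """Assumed 2-df test: sum of McNemar statistics from two tables.

    (b_dis, c_dis): discordant counts among diseased participants.
    (b_non, c_non): discordant counts among non-diseased participants.
    The two groups are disjoint, so the two statistics are independent.
    """
    stat = mcnemar_chi2(b_dis, c_dis) + mcnemar_chi2(b_non, c_non)
    return stat, chi2_sf_2df(stat)


if __name__ == "__main__":
    # Made-up counts, for demonstration only.
    print(mcnemar_test(15, 5))
    print(combined_two_table_test(15, 5, 8, 8))
```

The p-values use closed forms for the chi-square tail with 1 and 2 degrees of freedom, so the sketch needs only the standard library. With small discordant counts, an exact binomial version would usually be preferred over the chi-square approximation.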