We develop computational tools that evaluate the exact size and power of three tests of trend (permutation, bootstrap and asymptotic) without resorting to large-sample theory or simulation. We then use these tools to compare the operating characteristics of the three tests. The bootstrap test is ultra-conservative relative to the other two tests and, as a result, suffers a severe loss of power. The power of the asymptotic test is uniformly larger than that of the other two tests, but it fails to preserve the type-1 error over most of the range of the baseline response probability. The permutation test, being exact, is guaranteed to preserve the type-1 error throughout the range of the baseline response probability. The price paid for this guarantee is a loss of power relative to the asymptotic test; this loss is, however, small in most situations.

1 Motivating Example

Forty mice were divided into four equal groups. Each group was treated with a different dose of an animal carcinogen, as a result of which some mice developed a tumor. The data are displayed in Table 1. The goal is to test for a dose-response relationship. Specifically, let $\pi_j$ be the Bernoulli probability that an animal treated at dose $d_j$ develops a tumor. We wish to test the null hypothesis

$$H_0: \pi_1 = \pi_2 = \pi_3 = \pi_4 \equiv \pi \tag{1.1}$$

against the one-sided alternative hypothesis

$$H_1: \pi_1 \le \pi_2 \le \pi_3 \le \pi_4 \tag{1.2}$$
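To make the idea of computing size and power "without simulations" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the asymptotic test is the one-sided Cochran-Armitage statistic referred to a normal critical value, and it assumes equally spaced dose scores 0, 1, 2, 3; neither choice is stated in the text above. The group size of 10 follows from forty mice in four equal groups. All function names are hypothetical. The rejection probability is obtained by enumerating every possible table of tumor counts and summing the binomial probabilities of those tables that the test rejects.

```python
from itertools import product
from math import comb, sqrt
from statistics import NormalDist


def ca_z(x, n, d):
    """Cochran-Armitage trend z-statistic (assumed form of the asymptotic test)."""
    N, m = sum(n), sum(x)
    if m == 0 or m == N:
        return 0.0  # degenerate table: no variation, never reject
    p = m / N
    dbar = sum(nj * dj for nj, dj in zip(n, d)) / N
    num = sum(xj * (dj - dbar) for xj, dj in zip(x, d))
    var = p * (1 - p) * sum(nj * (dj - dbar) ** 2 for nj, dj in zip(n, d))
    return num / sqrt(var)


def binom_pmf(k, n, p):
    """Binomial probability of k responders among n animals."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(pis, n, d, alpha=0.05):
    """Exact probability that the asymptotic test rejects H0.

    With all entries of `pis` equal this is the exact size (1.1);
    with non-decreasing `pis` it is the exact power under (1.2).
    """
    zcrit = NormalDist().inv_cdf(1 - alpha)
    pmfs = [[binom_pmf(k, nj, pj) for k in range(nj + 1)]
            for nj, pj in zip(n, pis)]
    total = 0.0
    # Enumerate every table of tumor counts (11**4 tables for n = 10 each).
    for x in product(*(range(nj + 1) for nj in n)):
        if ca_z(x, n, d) >= zcrit:
            prob = 1.0
            for j, xj in enumerate(x):
                prob *= pmfs[j][xj]
            total += prob
    return total


n = [10, 10, 10, 10]   # four equal groups of mice
d = [0, 1, 2, 3]       # hypothetical dose scores (assumption)

for pi in (0.05, 0.2, 0.5):
    print(f"pi = {pi:.2f}: exact size = {rejection_probability([pi] * 4, n, d):.4f}")
```

Evaluating `rejection_probability` over a grid of baseline probabilities shows where the actual size of the asymptotic test departs from the nominal level, and passing an increasing vector such as `[0.1, 0.3, 0.5, 0.7]` (an illustrative alternative, not taken from the paper) gives exact power. Full enumeration is only practical for small designs like this one.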