Procedures for two-sample comparisons with multiple endpoints controlling the experimentwise error rate.

Clinical trials often compare two treatment groups on multiple endpoints. As alternatives to the commonly used methods, Hotelling's T² test and the Bonferroni method, O'Brien (1984, Biometrics 40, 1079-1087) proposes tests based on statistics that are simple or weighted sums of the single endpoints. This approach turns out to be powerful if all treatment differences are in the same direction [compare Pocock, Geller, and Tsiatis (1987, Biometrics 43, 487-498)]. The disadvantage of these multivariate methods is that they are suitable only for demonstrating a global difference, whereas the clinician is also interested in which specific endpoints or sets of endpoints actually caused this difference. It is shown here that all of these tests are suitable for the construction of a closed multiple test procedure in which, after the rejection of the global hypothesis, all lower-dimensional marginal hypotheses and finally the single-endpoint hypotheses are tested step by step. This procedure controls the experimentwise error rate. It is just as powerful as the multivariate test and, in addition, it can identify the individual endpoints or sets of endpoints that show significant treatment differences.
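
The closure principle described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `closed_test`, the callable `local_p`, and the p-values in the usage example are hypothetical and serve only as illustration. In practice `local_p` would return the p-value of a level-alpha multivariate test (for example an O'Brien sum statistic or the T² test) computed on the given subset of endpoints. The sketch evaluates all intersection hypotheses at once instead of stepping down from the global hypothesis; the resulting decisions are the same.

```python
from itertools import combinations

def closed_test(endpoints, local_p, alpha=0.05):
    """Return the endpoint subsets whose intersection hypothesis is
    rejected by the closed test procedure at level alpha.

    local_p is an assumed, user-supplied callable that maps a frozenset
    of endpoints to the p-value of a local level-alpha test of
    "no treatment difference in any endpoint of this subset".
    """
    endpoints = tuple(endpoints)
    # Every non-empty subset stands for one intersection hypothesis.
    subsets = [frozenset(c)
               for r in range(1, len(endpoints) + 1)
               for c in combinations(endpoints, r)]
    # Local decision for each intersection hypothesis.
    locally_rejected = {s for s in subsets if local_p(s) <= alpha}
    # Closure principle: reject a hypothesis only if every hypothesis
    # containing it (every superset, including itself) is locally rejected.
    return {s for s in subsets
            if all(t in locally_rejected for t in subsets if s <= t)}

# Hypothetical p-values for three endpoints A, B, C (illustration only).
p_values = {
    frozenset("ABC"): 0.004,
    frozenset("AB"): 0.010,
    frozenset("AC"): 0.020,
    frozenset("BC"): 0.150,
    frozenset("A"): 0.008,
    frozenset("B"): 0.200,
    frozenset("C"): 0.030,
}

rejected = closed_test("ABC", p_values.__getitem__, alpha=0.05)
# Rejected: {A, B, C}, {A, B}, {A, C} and the single endpoint A.
# Endpoint C has a local p-value below 0.05 but is not rejected,
# because the hypothesis for {B, C} is retained.
```

In this made-up example the global hypothesis is rejected exactly when the multivariate test on all endpoints rejects it, and the remaining decisions show which endpoints or sets of endpoints account for the difference.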