On inferences from Wei's biased coin design for clinical trials

Wei (1988) analyzed data from a clinical trial in which an urn-sampling model was used to allocate patients to treatments. In the trial, 11 patients were allocated to the experimental treatment, all successes, and one patient was allocated to the control treatment, a failure. Wei analyzed these data using a randomization algorithm and concluded that the results were almost significant, at p = 0.051. He asserted that an analysis which ignored the design and presumed complete randomization leads to a p-value of 0.001. In fact, under complete randomization, the more conventional analysis in which both margins are fixed (Fisher's exact test) leads to p = 0.083. If the urn-sampling allocation is taken into account, much more conservative inferences are obtained. An analysis which conditions on both margins leads to p = 0.28, while one which conditions on the observed sequence of responses and the observed treatment totals leads to p = 0.62. These serious discrepancies are discussed, in addition to the inappropriateness of biased coin designs and small sample sizes in important medical trials.
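
The Fisher's exact test figure quoted above can be checked by hand: with both margins fixed, the single failure falls on the lone control patient with probability 1/12. The following is a minimal sketch of that arithmetic under complete randomization only; the function name `fisher_exact_p` and the table layout are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the urn-sampling analyses.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Illustrative helper (not from the paper): rows are treatments,
    columns are outcomes (success, failure).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given that both margins are fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Experimental: 11 successes, 0 failures; control: 0 successes, 1 failure.
p_value = fisher_exact_p(11, 0, 0, 1)
print(round(p_value, 3))  # 1/12, i.e. 0.083
```

Only two tables are compatible with these margins, so the one-sided and two-sided p-values coincide here.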