On the non‐inferiority of a diagnostic test based on paired observations

Whether a diagnostic test is non-inferior to the standard or optimal test is a common question in medical research. Often we want to determine whether a new diagnostic test is as good as the standard reference test. Sometimes we are interested in an inexpensive test whose sensitivity or specificity may be lower than the standard's by an acceptable margin. While hypothesis testing procedures and sample size formulae for the equivalence of sensitivity or specificity alone have been proposed, very few studies have discussed simultaneous comparison of both parameters. In this paper, we present three testing procedures and sample size formulae for the simultaneous comparison of sensitivity and specificity based on paired observations with known disease status. These statistical procedures are then used to compare two classification rules that identify women at risk of future osteoporotic fracture. Simulation experiments demonstrate that the new tests and sample size formulae give the appropriate type I and type II error rates. Differences between our approach and the approach of Lui and Cumberland are discussed.
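As a rough illustration of what a simultaneous comparison on paired data can look like, the following minimal sketch combines two one-sided Wald-type tests, one for sensitivity among diseased subjects and one for specificity among non-diseased subjects, and declares non-inferiority only when both reject. This is a generic intersection-union construction and is not claimed to be any of the three procedures presented in the paper. The function names, the margins, and the counts in the usage note are hypothetical.

```python
import math
from statistics import NormalDist


def paired_ni_z(n10, n01, n, delta):
    """Wald-type z statistic for H0: p_new - p_std <= -delta (paired data).

    n10:   subjects the new test classifies correctly and the standard does not
    n01:   subjects the standard test classifies correctly and the new does not
    n:     total number of subjects in this disease stratum
    delta: non-inferiority margin (assumed, chosen by the analyst)
    """
    diff = (n10 - n01) / n
    # Estimated variance of the difference of two paired proportions.
    var = ((n10 + n01) / n - diff ** 2) / n
    if var <= 0:
        # Degenerate sample with zero estimated variance.
        return math.inf if diff + delta > 0 else -math.inf
    return (diff + delta) / math.sqrt(var)


def joint_noninferiority(diseased, healthy, delta_se, delta_sp, alpha=0.05):
    """Intersection-union style decision on sensitivity and specificity.

    diseased, healthy: (n10, n01, n) tuples for each disease stratum.
    Each component test is run at level alpha; non-inferiority is
    declared only if both one-sided tests reject.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    z_se = paired_ni_z(*diseased, delta_se)
    z_sp = paired_ni_z(*healthy, delta_sp)
    return z_se, z_sp, (z_se > z_crit and z_sp > z_crit)
```

For example, `joint_noninferiority((15, 20, 200), (10, 12, 300), delta_se=0.10, delta_sp=0.05)` returns the two z statistics and a boolean decision for made-up counts. The Wald variance used here can behave poorly in small samples, so the sketch should be read as a statement of the idea rather than a recommended procedure.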

[1]  Assessment of equivalence on multiple endpoints. Statistics in medicine, 2001.

[2]  W. Cumberland, et al. Sample size determination for equivalence test using rate ratio of sensitivity and specificity in paired sample data. Controlled clinical trials, 2001.

[3]  P. A. Lachenbruch, et al. Assessing screening tests: extensions of McNemar's test. Statistics in medicine, 1998.

[4]  J. A. Bean, et al. On the sample size for one-sided equivalence of sensitivities based upon McNemar's test. Statistics in medicine, 1995.

[5]  T. Morikawa, et al. A useful testing strategy in phase III trials: combined test of superiority and test of equivalence. Journal of biopharmaceutical statistics, 1995.

[6]  T. Tango. Equivalence test and confidence interval for the difference in proportions for the paired-sample design. 1998.

[7]  R. Berger, et al. Bioequivalence trials, intersection-union tests and equivalence confidence sets. 1996.

[8]  J. Swets. Sensitivities and specificities of diagnostic tests. JAMA, 1982.

[9]  M. Segal, et al. Radiologic staging in patients with endometrial cancer: a meta-analysis. Radiology, 1999.

[10]  J. M. Nam, et al. Establishing equivalence of two treatments and sample size requirements in matched-pairs design. Biometrics, 1997.

[11]  W. Blackwelder, et al. Analysis of the ratio of marginal probabilities in a matched‐pair setting. Statistics in medicine, 2002.

[12]  Huey-miin Hsueh, et al. Tests for equivalence or non‐inferiority for paired binary data. Statistics in medicine, 2002.

[13]  B. Wiens, et al. Choosing an equivalence limit for noninferiority or equivalence studies. Controlled clinical trials, 2002.

[14]  S. Cummings, et al. Classification of Osteoporosis Based on Bone Mineral Densities. Journal of bone and mineral research: the official journal of the American Society for Bone and Mineral Research, 2001.

[15]  J. R. Thornbury, et al. Eugene W. Caldwell Lecture. Clinical efficacy of diagnostic imaging: love it or leave it. AJR. American journal of roentgenology, 1994.

[16]  P. Miller. Controversies in Bone Mineral Density Diagnostic Classifications. Calcified Tissue International, 2000.

[17]  P. Miller, et al. Discordance in patient classification using T-scores. Journal of clinical densitometry: the official journal of the International Society for Clinical Densitometry, 1999.

[18]  C. Gatsonis. Design of evaluations of imaging technologies: development of a paradigm. Academic radiology, 2000.

[19]  J. Hanley. Receiver operating characteristic (ROC) methodology: the state of the art. Critical reviews in diagnostic imaging, 1989.