Measuring Agreement in Method Comparison Studies — A Review

Assessing agreement between two or more methods of measurement is important in many fields. In medicine in particular, new methods or devices that are cheaper, easier to use, or less invasive are routinely developed. Agreement between a new method and a traditional reference or gold standard must be evaluated before the new method is put into practice. Various methodologies have been proposed for this purpose in recent years. We review the literature, focusing on the assessment of agreement between two methods and on the selection of the best method when several are compared with a reference. A real data set is analyzed to illustrate the various approaches.
