Intraclass correlation – A discussion and demonstration of basic features

A re-analysis of intraclass correlation (ICC) theory is presented together with Monte Carlo simulations of ICC probability distributions. A partly revised and simplified theory of the single-score ICC is obtained, together with a simple alternative recipe for its use in reliability studies. Our main practical conclusion is that, when analysing a reliability study, it is neither necessary nor convenient to start by choosing a specific statistical model. Rather, one may impartially compute all three single-score ICC formulas. Near equality of the three ICC values indicates the absence of bias (systematic error), in which case the classical (one-way random) ICC may be used. A consistency ICC larger than the absolute agreement ICC indicates the presence of non-negligible bias; in that case the classical ICC is invalid and misleading. An F-test may be used to confirm whether biases are present. From the resulting model (with or without bias), variances and confidence intervals may then be calculated. In the presence of bias, both the absolute agreement ICC and the consistency ICC should be reported, since they give different and complementary information about the reliability of the method. A clinical example with data from the literature is given.
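
The recipe described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the conventional ANOVA mean-square forms of the three single-score ICCs (as tabulated, for example, by McGraw and Wong, 1996), which may differ in detail from the partly revised theory of the paper. The function name `single_score_iccs`, the dictionary keys, and the example scores are all hypothetical.

```python
def single_score_iccs(x):
    """Compute the three single-score ICCs and the bias F statistic.

    x is a list of n rows (subjects), each holding k scores
    (one per rater or measurement occasion), with no missing values.
    """
    n = len(x)          # number of subjects
    k = len(x[0])       # number of raters / occasions
    grand = sum(sum(row) for row in x) / (n * k)
    row_means = [sum(row) / k for row in x]
    col_means = [sum(row[j] for row in x) / n for j in range(k)]

    # Sums of squares for the one-way and two-way ANOVA decompositions
    ss_total = sum((v - grand) ** 2 for row in x for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_within = ss_total - ss_rows      # within subjects (one-way)
    ss_error = ss_within - ss_cols      # residual (two-way)

    # Mean squares
    msr = ss_rows / (n - 1)                  # between subjects
    msw = ss_within / (n * (k - 1))          # within subjects
    msc = ss_cols / (k - 1)                  # between raters (bias)
    mse = ss_error / ((n - 1) * (k - 1))     # residual error

    return {
        # classical one-way random ICC
        "icc_oneway": (msr - msw) / (msr + (k - 1) * msw),
        # absolute agreement ICC
        "icc_agreement": (msr - mse)
        / (msr + (k - 1) * mse + k * (msc - mse) / n),
        # consistency ICC
        "icc_consistency": (msr - mse) / (msr + (k - 1) * mse),
        # F statistic for rater bias and its degrees of freedom
        "f_bias": msc / mse,
        "df": (k - 1, (n - 1) * (k - 1)),
    }


if __name__ == "__main__":
    # Hypothetical scores: 5 subjects, 2 raters, rater 2 tends to score higher
    scores = [[10, 13], [12, 14], [15, 19], [11, 13], [14, 18]]
    for name, value in single_score_iccs(scores).items():
        print(name, value)
```

If the three ICC values come out nearly equal, the data are compatible with the bias-free model. A consistency ICC clearly above the absolute agreement ICC points to bias, which can be checked by comparing `f_bias` with the F distribution on the returned degrees of freedom (for instance with `scipy.stats.f.sf`, if SciPy is available).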
