High Agreement and High Prevalence: The Paradox of Cohen’s Kappa
I. Baldi, S. Zec, R. Comoretto, N. Soriani