Reliability in evaluator-based tests: using simulation-constructed models to determine contextually relevant agreement thresholds
Zachary C. Thumser | Dylan T. Beckler | Jonathon S. Schofield | Paul D. Marasco
[1] J. Bartlett,et al. Reliability, repeatability and reproducibility: analysis of measurement errors in continuous variables , 2008, Ultrasound in obstetrics & gynecology : the official journal of the International Society of Ultrasound in Obstetrics and Gynecology.
[2] Tatyana Shatalova,et al. On the choice of measures of reliability and validity in the content-analysis of texts , 2014 .
[3] Klaus Krippendorff,et al. Answering the Call for a Standard Reliability Measure for Coding Data , 2007 .
[4] P. Neven,et al. Inter-rater reliability of shoulder measurements in middle-aged women. , 2017, Physiotherapy.
[5] Roberto Revetria,et al. Monte Carlo Simulation Models Evolving in Replicated Runs: A Methodology to Choose the Optimal Experimental Sample Size , 2012 .
[6] Mary McGee Wood,et al. Squibs and Discussions: Evaluating Discourse and Dialogue Coding Schemes , 2005, CL.
[7] E. Bartels,et al. Reliability of Pain Measurements Using Computerized Cuff Algometry: A DoloCuff Reliability and Agreement Study , 2017, Pain practice : the official journal of World Institute of Pain.
[8] Jean Carletta,et al. Squibs: Reliability Measurement without Limits , 2008, CL.
[9] Jean-Yves Antoine,et al. Weighted Krippendorff’s alpha is a more reliable metrics for multi-coders ordinal annotations: experimental studies on emotion, opinion and coreference annotation , 2014, EACL.
[10] Ron Artstein,et al. Survey Article: Inter-Coder Agreement for Computational Linguistics , 2008, CL.
[11] P. Vacek,et al. A protocol for the Hamilton Rating Scale for Depression: Item scoring rules, Rater training, and outcome accuracy with data on its application in a clinical trial. , 2016, Journal of affective disorders.
[12] Klaus Krippendorff,et al. Content Analysis: An Introduction to Its Methodology , 1980 .
[13] E. Wikstrom,et al. Reliability of two-point discrimination thresholds using a 4-2-1 stepping algorithm , 2016, Somatosensory & motor research.
[14] Barbara Maria Di Eugenio,et al. Squibs and Discussions - The Kappa Statistic , 2004 .
[15] K. Krippendorff. Content Analysis: An Introduction to Its Methodology. Beverly Hills, CA: Sage , 1980 .
[16] S. Walter,et al. Sample size and optimal designs for reliability studies. , 1998, Statistics in medicine.
[17] M. Banerjee,et al. Beyond kappa: A review of interrater agreement measures , 1999 .
[18] H. Endeman,et al. Validation of the Dutch version of the critical-care pain observation tool. , 2019, Nursing in critical care.
[19] Klaus Krippendorff,et al. Agreement and Information in the Reliability of Coding , 2011 .
[20] Jean Carletta,et al. Assessing Agreement on Classification Tasks: The Kappa Statistic , 1996, CL.
[21] Guangchao Charles Feng,et al. Factors affecting intercoder reliability: a Monte Carlo experiment , 2013 .
[22] C. Cooper,et al. Inter-rater reliability of distal ureteral diameter ratio compared to grade of VUR. , 2016, Journal of Pediatric Urology.
[23] E. R. Cohen. An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements , 1998 .
[24] T. Sozu,et al. Effective number of subjects and number of raters for inter‐rater reliability studies , 2006, Statistics in medicine.
[25] G. Bonsel,et al. Feasibility and reliability of a newly developed antenatal risk score card in routine care. , 2015, Midwifery.
[26] Anne Garrison Wilhelm,et al. Exploring Differences in Measurement and Reporting of Classroom Observation Inter-Rater Reliability. , 2018 .
[27] Klaus Krippendorff,et al. Computing Krippendorff's Alpha-Reliability , 2011 .
[28] Deen Freelon,et al. ReCal OIR: Ordinal, Interval, and Ratio Intercoder Reliability as a Web Service , 2013 .
[29] K. Krippendorff. Reliability in Content Analysis: Some Common Misconceptions and Recommendations , 2004 .
[30] M. Lombard,et al. Content Analysis in Mass Communication: Assessment and Reporting of Intercoder Reliability , 2002 .
[31] Barbara Di Eugenio,et al. Squibs and Discussions: The Kappa Statistic: A Second Look , 2004, CL.
[32] Rebecca Zwick,et al. Another look at interrater agreement. , 1988, Psychological bulletin.