Classification accuracy in Key Stage 2 National Curriculum tests in England
[1] G. Masters,et al. Rating Scale Analysis. Rasch Measurement. , 1983 .
[2] L. Harvill,et al. Standard Error of Measurement , 1991 .
[3] D. Andrich. Rating Scale Analysis , 1999 .
[4] Won-Chan Lee,et al. Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory , 2010 .
[5] Bradley A. Hanson. A Comparison of Presmoothing and Postsmoothing Methods in Equipercentile Equating. ACT Research Report Series 94-4. , 1994 .
[6] R. Hambleton,et al. Fundamentals of Item Response Theory , 1991 .
[7] Robert L. Brennan,et al. Center for Advanced Studies in Measurement and Assessment , 2009 .
[8] Willem J. van der Linden,et al. Book reviews: Applying the Rasch Model , 2001 .
[9] Paul Black,et al. The Reliability of assessments , 2012 .
[10] Qingping He,et al. The reliability programme: final report , 2011 .
[11] Bo Zhang,et al. Investigating Proficiency Classification for the Examination for the Certificate of Proficiency in English (ECPE) , 2008 .
[12] L. S. Feldt,et al. A Comparison of Five Methods for Estimating the Standard Error of Measurement at Specific Score Levels , 1985 .
[13] M. R. Novick,et al. Statistical Theories of Mental Test Scores. , 1971 .
[14] P. Newton. The reliability of results from national curriculum testing in England , 2009 .
[15] B. Hanson. Method of Moments Estimates for the Four-Parameter Beta Compound Binomial Model and the Calculation of Classification Consistency Indexes , 1991 .
[16] G. Bolton. Reliability , 2003, Medical Humanities.
[17] F. Lord. Applications of Item Response Theory To Practical Testing Problems , 1980 .
[18] Lyle F. Bachman,et al. Language testing in practice , 1998 .
[19] L. Cronbach. Coefficient alpha and the internal structure of tests , 1951 .
[20] M. R. Espejo. Applying the Rasch Model: Fundamental Measurement in the Human Sciences , 2004 .
[21] Lyle F. Bachman. Statistical analyses for language assessment , 2004 .
[22] D. Wiliam. Reliability, validity, and all that jazz , 2001 .
[23] Shameem Nyla. NATIONAL COUNCIL ON MEASUREMENT IN EDUCATION , 2004 .
[24] D. Eignor. The standards for educational and psychological testing. , 2013 .
[25] Lawrence M. Rudner. Computing the Expected Proportions of Misclassified Examinees. , 2001 .
[26] Charles Lewis,et al. Estimating the Consistency and Accuracy of Classifications Based on Test Scores , 1993 .
[27] Audrey L. Qualls-Payne. A Comparison of Score Level Estimates of the Standard Error of Measurement , 1992 .
[28] G. Masters. A Rasch model for partial credit scoring , 1982 .
[29] F. Lord. Estimating true-score distributions in psychological testing (an empirical bayes estimation problem) , 1969 .
[30] L. Crocker,et al. Introduction to Classical and Modern Test Theory , 1986 .
[31] Georg Rasch,et al. Probabilistic Models for Some Intelligence and Attainment Tests , 1981, The SAGE Encyclopedia of Research Design.
[32] J. Gardner,et al. The fallibility of high stakes ‘11‐plus’ testing in Northern Ireland , 2005 .
[33] R. Hambleton,et al. Item Response Theory , 1984, The History of Educational Measurement.
[34] R. Traub,et al. NCME Instructional Module: Understanding Reliability. , 1991 .
[35] Lawrence M. Rudner. Expected Classification Accuracy , 2005 .