MONITORING FACULTY CONSULTANT PERFORMANCE IN THE ADVANCED PLACEMENT ENGLISH LITERATURE AND COMPOSITION PROGRAM WITH A MANY-FACETED RASCH MODEL
[1] Joseph L. Fleiss,et al. Balanced Incomplete Block Designs for Inter-Rater Reliability Studies , 1981 .
[2] F. Lord. Applications of Item Response Theory To Practical Testing Problems , 1980 .
[3] E. Beale,et al. Missing Values in Multivariate Analysis , 1975 .
[4] C. I. Chase. ESSAY TEST SCORING: INTERACTION OF RELEVANT VARIABLES , 1986 .
[5] Elana Shohamy,et al. The Effect of Raters' Background and Training on the Reliability of Direct Writing Tests , 1992 .
[6] George Engelhard,et al. The Measurement of Writing Ability With a Many-Faceted Rasch Model , 1992 .
[7] Joyce M. Bainbridge,et al. Teachers' gendered expectations and their evaluation of student writing , 1998 .
[8] J. M. Linacre,et al. Investigating rating scale category utility. , 1999, Journal of outcome measurement.
[9] Georg Rasch,et al. Probabilistic Models for Some Intelligence and Attainment Tests , 1981, The SAGE Encyclopedia of Research Design.
[10] Tom Lumley,et al. Rater characteristics and rater bias: implications for training , 1995 .
[11] R. L. Ebel,et al. Estimation of the reliability of ratings , 1951 .
[12] G. Masters,et al. Rating Scale Analysis. Rasch Measurement. , 1983 .
[13] Lynn C. Webb. Rater Stringency and Consistency in Performance Assessment. , 1990 .
[14] George Engelhard,et al. The Influences of Mode of Discourse, Experiential Demand, and Gender on the Quality of Student Writing. , 1992 .
[15] Robert J. Mislevy,et al. Monitoring and Improving a Portfolio Assessment System. , 1995 .
[16] S. Graham,et al. Effects of the Learning Disability Label, Quality of Writing Performance, and Examiner's Level of Expertise on the Evaluation of Written Products , 1987, Journal of learning disabilities.
[17] G. Engelhard,et al. INVESTIGATING ASSESSOR EFFECTS IN NATIONAL BOARD FOR PROFESSIONAL TEACHING STANDARDS ASSESSMENTS FOR EARLY CHILDHOOD/GENERALIST AND MIDDLE CHILDHOOD/GENERALIST CERTIFICATION , 2000 .
[18] Edward W. Wolfe,et al. The Relationship between Scoring Procedures and Focus and the Reliability of Direct Writing Assessment Scores. , 1996 .
[19] George Engelhard,et al. Measurement with judges: Many-faceted conjoint measurement , 1994 .
[20] William E. Coffman,et al. A Comparison of Two Methods of Reading Essay Examinations , 1968 .
[21] George Engelhard,et al. Examining Rater Errors in the Assessment of Written Composition With a Many-Faceted Rasch Model , 1994 .
[22] George Engelhard,et al. Rater, Domain, and Gender Influences on the Assessed Quality of Student Writing Using Weighted and Unweighted Scoring. , 1998 .
[23] Nicole A. Lazar,et al. Statistical Analysis With Missing Data , 2003, Technometrics.
[24] Nigel O'Brian,et al. Generalizability Theory I , 2003 .
[25] N. Longford. Reliability of Essay Rating and Score Adjustment , 1994 .
[26] Edward W. Wolfe,et al. Learning To Rate Essays: A Study of Scorer Cognition. , 1994 .
[27] Expert/Novice Differences in the Focus and Procedures Used by Essay Scorers. , 1996 .
[28] George Engelhard,et al. Evaluating Rater Accuracy in Performance Assessments. , 1996 .
[29] Walter M. Houston,et al. Correcting Performance-Rating Errors in Oral Examinations , 1991, Evaluation & the health professions.
[30] Walter M. Houston,et al. Detecting and Correcting for Rater Effects in Performance Assessment , 1990 .
[31] L. Crocker,et al. Introduction to Classical and Modern Test Theory , 1986 .
[32] 村上 省三,et al. Technical manual : 日本語版 , 1985 .
[33] Henry Braun,et al. Understanding Scoring Reliability: Experiments in Calibrating Essay Readers , 1988 .
[34] C. Cason,et al. A Deterministic Theory of Clinical Performance Rating , 1984, Evaluation & the health professions.
[35] Mark R. Wilson,et al. An Examination of Variation in Rater Severity Over Time : A Study in Rater Drift , 2000 .
[36] T. McNamara. Measuring Second Language Performance , 1996 .
[37] Sara Cushing Weigle,et al. Investigating rater/prompt interactions in writing assessment: Quantitative and qualitative approaches , 1999 .
[38] Henry Braun. CALIBRATION OF ESSAY READERS FINAL REPORT , 1986 .
[39] H. John Bernardin,et al. Effects of rater training: Creating new response sets and decreasing accuracy. , 1980 .
[40] Edward W. Wolfe,et al. Detecting and measuring rater effects using many-facet Rasch measurement: part I. , 2003, Journal of applied measurement.
[41] M. Lunz,et al. Examining the Invariance of Rater and Project Calibrations Using a Multi-facet Rasch Model. , 1996 .
[42] G. Engelhard,et al. Writing tasks and gender: influences on writing quality of Black and White students , 1994 .
[43] Brent Bridgeman,et al. RELIABILITY OF ADVANCED PLACEMENT EXAMINATIONS , 1996 .
[44] Mary E. Lunz,et al. Measuring the Impact of Judge Severity on Examination Scores , 1990 .
[45] Chockalingam Viswesvaran,et al. Least Squares Models to Correct for Rater Effects in Performance Assessment , 1993 .
[46] Walter M. Houston,et al. Adjustments for Rater Effects in Performance Assessment , 1991 .
[47] B. Huot,et al. Validating holistic scoring for writing assessment : theoretical and empirical foundations , 1993 .
[48] Parameter Estimation for Peer Grading under Incomplete Design , 1988 .
[49] G. Engelhard,et al. Gender Differences in Performance on Multiple-Choice and Constructed Response Mathematics Items. , 1999 .
[50] Mary E. Lunz,et al. Judge Consistency and Severity Across Grading Periods , 1990 .
[51] M. Raymond. Missing Data in Evaluation Research , 1986 .
[52] Lawrence M. Rudner. Reducing Errors Due to the Use of Judges , 1992 .
[53] K. Ercikan,et al. The Consistency Between Raters Scoring in Different Test Years , 1998 .
[54] G. Engelhard. Constructing rater and task banks for performance assessments. , 1997, Journal of outcome measurement.
[55] D. McArthur. Bias in the Writing of Prose and Its Appraisal. , 1981 .
[56] Charles E. Lance,et al. A Test of the Context Dependency of Three Causal Models of Halo Rater Error , 1994 .
[57] B. Wright,et al. Best test design , 1979 .
[58] Nicholas T. Longford. A Case for Adjusting Subjectively Rated Scores in the Advanced Placement Tests. Program Statistics Research. Technical Report No. 94-5. , 1994 .
[59] D. Andrich. A rating formulation for ordered response categories , 1978 .
[60] D. D. Gruijter. Two simple models for rater effects. , 1984 .
[61] Stephen B. Dunbar,et al. Complex, Performance-Based Assessment: Expectations and Validation Criteria , 1991 .
[62] Carol M. Myford,et al. READER CALIBRATION AND ITS POTENTIAL ROLE IN EQUATING FOR THE TEST OF WRITTEN ENGLISH , 1995 .