Large-scale assessment systems: Design principles drawn from international comparisons

In recent years, a number of analyses of the assessments used in different countries have appeared. Eckstein and Noah (1993) reviewed the experiences of students in the final year of upper secondary education in China, England and Wales, France, Germany, Japan, Sweden, the Soviet Union, and the United States, looking at the details of the assessments used and at wider policy issues such as success rates and the degree of centralization. Britton and Raizen (1996) looked in greater detail at the assessments used in science and mathematics, and in particular at the coverage of different topics. These, and other analyses emerging from international comparisons such as TIMSS and PISA, have focused on what might be termed “cross-sectional” comparisons, examining the differences between the assessments in different national systems for students of the same age. Other studies have focused on the study of systems of assessments within a
