What Counts as Evidence: A Review of Validity Studies in Educational and Psychological Measurement

Validity is considered a fundamental concern in educational measurement, yet it remains an intensely debated concept with conflicting implications for the practice of validation. This study used a systematic review process to analyze past (1960–1969) and present (2000–2009) validation practices, with an eye toward implications for both the theory and practice of validation. Articles in the “Validity Studies” section of the journal Educational and Psychological Measurement were systematically selected, and the validity evidence presented in each study was classified according to the sources of evidence described in the AERA, APA, and NCME Standards for Educational and Psychological Testing. Results show that two primary categories of evidence, evidence based on response processes and evidence based on consequences of test use, were consistently absent, and that modern theoretical validity frameworks were rarely used or cited. These findings are discussed in the context of current theoretical debates, and their implications for the practice of test validation are considered.
