Modeling the Predictive Validity of SAT Mathematics Items Using Item Characteristics

There is much debate over the merits and pitfalls of standardized tests for college admission, with questions about format (multiple-choice vs. constructed response), cognitive complexity, and content (achievement vs. aptitude) at the forefront of the discussion. This study addressed these questions by investigating the relationship between SAT Mathematics (SAT-M) item characteristics and each item's ability to predict college outcomes. Using multiple regression, SAT-M item characteristics (content area, format, cognitive complexity, and abstract/concrete classification) were used to predict three outcome measures: the correlation of item score with first-year college grade point average, the correlation of item score with mathematics course grades, and the percentage of students who answered the item correctly and chose to major in a mathematics or science field. Separate models were run with and without item difficulty and discrimination as covariates. The results revealed that many of the item characteristics were related to the outcome measures, and that item difficulty and discrimination had a mediating effect on several of the predictors, particularly the effects of nonroutine/insightful items and multiple-choice items.
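The modeling approach described above (regressing item-level validity coefficients on item characteristics, with and without difficulty and discrimination as covariates) can be sketched as follows. This is a minimal illustration on synthetic data; all variable names, coding choices, and effect sizes are assumptions for demonstration, not the study's actual coding scheme or results.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items = 200

# Hypothetical dummy-coded item characteristics (illustrative only)
multiple_choice = rng.integers(0, 2, n_items)   # 1 = multiple-choice format
nonroutine = rng.integers(0, 2, n_items)        # 1 = nonroutine/insightful item
abstract_item = rng.integers(0, 2, n_items)     # 1 = abstract (vs. concrete)
difficulty = rng.normal(0, 1, n_items)          # covariate, e.g., an IRT b-like index
discrimination = rng.normal(1, 0.3, n_items)    # covariate, e.g., an IRT a-like index

# Synthetic outcome: each item's correlation with first-year GPA,
# partly driven by the covariates (assumed effect sizes)
r_fygpa = (0.05 * nonroutine - 0.03 * multiple_choice
           + 0.10 * discrimination + rng.normal(0, 0.02, n_items))

def ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

# Model 1: item characteristics only
X_base = np.column_stack([multiple_choice, nonroutine, abstract_item])
beta_base = ols(X_base, r_fygpa)

# Model 2: add item difficulty and discrimination as covariates;
# comparing coefficients across the two models shows how the covariates
# absorb part of the characteristic effects
X_full = np.column_stack([X_base, difficulty, discrimination])
beta_full = ols(X_full, r_fygpa)

print("characteristics only:", np.round(beta_base, 3))
print("with covariates:     ", np.round(beta_full, 3))
```

Contrasting the two coefficient vectors mirrors the study's design of running separate models with and without the difficulty/discrimination covariates.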
