Gender Fairness within the Force Concept Inventory

Research on the test structure of the Force Concept Inventory (FCI) has largely ignored gender, and research on FCI gender effects (often reported as "gender gaps") has seldom interrogated the structure of the test. These rarely crossed streams of research leave open the possibility that the FCI may not be structurally valid across genders, particularly since many reported results come from calculus-based courses where 75% or more of the students are men. We examine the FCI using both psychometrics and gender disaggregation (while acknowledging that this is a binary simplification), and find several problematic questions whose removal decreases the apparent gender gap. We analyze three samples (total $N_{\mathrm{pre}} = 5{,}391$, $N_{\mathrm{post}} = 5{,}769$) for gender asymmetries using Classical Test Theory, Item Response Theory, and Differential Item Functioning. The combination of these methods highlights six items that appear substantially unfair to women and two items biased in favor of women. No single physical concept or prior experience unifies these questions, but they are broadly consistent with problematic items identified in previous research. Removing all significantly gender-unfair items halves the gender gap in the main sample in this study. We recommend that instructors using the FCI report the reduced-instrument score as well as the 30-item score, and that credit or other benefits to students not be assigned using the biased items.
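As an illustration of what a Differential Item Functioning check involves, the following is a minimal sketch of one common DIF statistic, the Mantel-Haenszel common odds ratio expressed on the ETS delta scale. The abstract does not state which DIF statistic was used, so this choice is an assumption, as are the function name `mantel_haenszel_ddif`, the group labels, and the data layout (one list of 0/1 item scores per student). Students are matched on total score, and a negative D-DIF value indicates that the item is harder for the focal group than for equally scoring members of the reference group.

```python
import math
from collections import defaultdict


def mantel_haenszel_ddif(responses, groups, item):
    """Mantel-Haenszel DIF for one dichotomous item (illustrative sketch).

    responses: list of per-student lists of 0/1 item scores
    groups:    list of "ref" or "focal" labels, one per student (hypothetical labels)
    item:      index of the studied item
    Returns (alpha_mh, d_dif), where d_dif = -2.35 * ln(alpha_mh).
    """
    # One 2x2 table per total-score stratum:
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for resp, group in zip(responses, groups):
        total = sum(resp)  # matching variable: total test score
        cell = (0 if resp[item] == 1 else 1) + (0 if group == "ref" else 2)
        strata[total][cell] += 1

    num = 0.0
    den = 0.0
    for a, b, c, d in strata.values():
        t = a + b + c + d
        num += a * d / t
        den += b * c / t

    if num == 0.0 or den == 0.0:
        raise ValueError("odds ratio undefined: no informative strata")

    alpha_mh = num / den
    return alpha_mh, -2.35 * math.log(alpha_mh)


# Toy two-item data, invented purely for illustration.
toy_responses = [[1, 0]] * 3 + [[0, 1]] * 1 + [[1, 0]] * 1 + [[0, 1]] * 3
toy_groups = ["ref"] * 4 + ["focal"] * 4
alpha, d_dif = mantel_haenszel_ddif(toy_responses, toy_groups, item=0)
print(alpha, d_dif)
```

In practice such a statistic would be computed for each of the 30 items and combined with a significance test and flagging thresholds; those steps are omitted here.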
