Exogenous Variables and Value-Added Assessments: A Fatal Flaw

Background: There has been rapid growth in value-added assessment of teachers to meet the widely supported policy goal of identifying the most effective and the most ineffective teachers in a school system. The former group is to be rewarded, while the latter group is to be helped or fired for poor performance. But value-added approaches to teacher evaluation have many problems. Chief among them is the commonly found class-to-class and year-to-year unreliability of the scores obtained. A teacher’s value-added scores appear to be highly unstable across two classes of the same subject taught in the same semester, and from class to class across two adjacent years.

Focus of Study: This literature review first focuses on the confusion in the minds of the public and politicians between teachers’ effects on individual students, which may be great and are usually positive, and teachers’ effects on classroom mean achievement scores, which may be limited by the huge number of exogenous variables affecting those scores. Exogenous variables are unaccounted-for influences on the data, such as peer classroom effects, school compositional effects, and characteristics of the neighborhoods in which some students live. Further, even when some of these variables are measured, the interactions among them often go unexamined, although two-way and three-way interactions are quite likely to be occurring and influencing classroom achievement. This analysis promotes the idea that the ubiquitous and powerful effects of these myriad exogenous variables on value-added scores are the reason that almost all current research finds instability in teachers’ classroom behavior and instability in teachers’ value-added scores. This may pose a fatal flaw in implementing value-added assessments of teaching competency.

Research Design: This is an analytic essay that draws on a selective literature review and some secondary analyses.

Conclusions: I conclude that because of the effects of countless exogenous variables on student classroom achievement, value-added assessments are not now, and may never be, stable enough from class to class or year to year to be used in evaluating teachers. The hope is that with three or more years of value-added data, the identification of extremely good and extremely bad teachers might be possible. That goal is not assured, however, and empirical results suggest that it is quite hard to reliably identify extremely good and extremely bad groups of teachers. When picking extremes among teachers, both luck and regression to the mean combine with the interactions of many variables to produce instability in the value-added scores that are obtained. Examination of the apparently simple policy goal of identifying the best and worst teachers in a school system reveals a morally problematic and psychometrically inadequate base for those policies. In fact, the belief that there are thousands of consistently inadequate teachers may be like the search for welfare queens and disability scam artists: more sensationalism than reality.
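The point about luck and regression to the mean when picking extremes can be illustrated with a small simulation. The sketch below is not drawn from the essay or from any study it reviews. It assumes a deliberately simple model in which each observed value-added score is a stable teacher effect plus independent yearly noise, and the function names, the `reliability` value of 0.3, and the 10% cutoff are all hypothetical choices made for illustration only.

```python
import random
import statistics


def simulate_value_added(n_teachers=1000, reliability=0.3, seed=0):
    """Return two years of simulated scores for the same teachers.

    Assumed model (illustrative only): observed score = stable teacher
    effect + independent yearly noise, scaled so the observed variance is
    about 1. `reliability` is the assumed share of that variance due to
    the stable effect; it is not an estimate from any study.
    """
    rng = random.Random(seed)
    true_sd = reliability ** 0.5
    noise_sd = (1.0 - reliability) ** 0.5
    true_effect = [rng.gauss(0.0, true_sd) for _ in range(n_teachers)]
    year1 = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
    return year1, year2


def bottom_group_summary(year1, year2, fraction=0.1):
    """Follow the lowest-scoring `fraction` of teachers from year 1 into year 2.

    Returns the group's mean score in year 1, the same group's mean score
    in year 2, and the share of the group that is again in the bottom
    `fraction` in year 2.
    """
    k = int(len(year1) * fraction)
    bottom_y1 = sorted(range(len(year1)), key=lambda i: year1[i])[:k]
    bottom_y2 = set(sorted(range(len(year2)), key=lambda i: year2[i])[:k])
    mean_y1 = statistics.mean(year1[i] for i in bottom_y1)
    mean_y2 = statistics.mean(year2[i] for i in bottom_y1)
    still_bottom = sum(1 for i in bottom_y1 if i in bottom_y2) / k
    return mean_y1, mean_y2, still_bottom


if __name__ == "__main__":
    y1, y2 = simulate_value_added()
    mean_y1, mean_y2, still_bottom = bottom_group_summary(y1, y2)
    print(f"Bottom 10% in year 1, mean score in year 1: {mean_y1:.2f}")
    print(f"Same teachers, mean score in year 2:        {mean_y2:.2f}")
    print(f"Share still in the bottom 10% in year 2:    {still_bottom:.0%}")
```

Under these assumptions, the teachers flagged as the worst in year 1 score closer to the average in year 2, and only a minority are flagged again, even though nothing about the teachers themselves has changed. Raising the assumed `reliability` makes the flagged group more stable, which is why the argument depends on how much of the observed score is attributable to the teacher rather than to exogenous influences.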
