Mixed-format exams in higher education: Assessment of internal consistency reliability

In higher education courses, instructors often use mixed-format exams, composed of several types of questions such as essays, problem-solving, and multiple-choice, to evaluate student performance. It is important to discriminate reliably amongst students according to their performance on final examinations: the lower the reliability of student exam scores, the greater the error associated with decisions based on them. Why then have we found no previous studies of reliability for this, one of the most common types of exam? We investigated the reliability of student scores on 12 official mixed-format final exams used in 22 classes with 1012 students in six undergraduate courses taught by five professors in three fields of business (finance, accounting, and statistics). We focussed on estimating internal consistency reliability, which is essentially a measure of the reproducibility of test scores. Using coefficient omega, the most appropriate measure for assessing the reliability of mixed-format exams, we found that reliability in these 22 classes averaged .85, with over 90% of the classes having reliabilities exceeding .80. These reliabilities are very high, comparable with those reported for professionally developed standardized tests and better than those reported recently for single-format multiple-choice exams in higher education. http://dx.doi.org/10.4995/HEAd15.2015.364
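The abstract does not give a formula for coefficient omega or say how it was estimated. As a rough illustration only, the sketch below uses the standard one-factor (congeneric) definition, omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances), and compares it with coefficient alpha computed from the model-implied covariance matrix. The three exam sections, their loadings, and their error variances are hypothetical values chosen for the example, not data from the study.

```python
def coefficient_omega(loadings, error_variances):
    """Omega for a one-factor model with the factor variance fixed at 1."""
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))


def implied_covariance(loadings, error_variances):
    """Covariance matrix implied by the one-factor model."""
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (error_variances[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]


def coefficient_alpha(cov):
    """Cronbach's alpha from a covariance matrix of part scores."""
    k = len(cov)
    trace = sum(cov[i][i] for i in range(k))
    total = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - trace / total)


# Hypothetical sections: essay, problem-solving, multiple-choice.
loadings = [0.9, 0.7, 0.5]
error_variances = [0.19, 0.51, 0.75]

omega = coefficient_omega(loadings, error_variances)
alpha = coefficient_alpha(implied_covariance(loadings, error_variances))
print(f"omega = {omega:.3f}, alpha = {alpha:.3f}")
```

With unequal loadings, as might be expected when sections differ in format, alpha falls below omega in this sketch, which is one reason omega is often preferred for mixed-format tests.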
