Methods for Creating and Evaluating the Item Model Structure Used in Automatic Item Generation

Automatic item generation (AIG) represents a relatively new but rapidly evolving research area where cognitive theories, computer technologies, and psychometric practices are used to generate items. In its most ambitious form, AIG can be described as the process of using models to generate statistically calibrated items with the aid of computer technology. Significant developments in AIG research and practice have occurred in the last decade, with a particularly strong wave of development occurring in the last several years. Important areas of AIG growth include cognitive model development.

Automatic item generation requires three general steps. First, content and test development specialists create item models that highlight the features or elements in the assessment task that can be manipulated. Second, the elements in the item model are varied to generate new items with the aid of computer-based algorithms. Third, statistical models are used to estimate the psychometric properties of the generated items based on the combination of elements used in item assembly. The focus of our study is on steps 1 and 2: item model development and item generation.

An item model contains the variables in an assessment task that can be manipulated and used for generation. These elements include the stem, the options, and the auxiliary information. The stem is the part of an item model that contains the context, content, item, and/or the question the examinee is required to answer. The options include the alternative answers, with one correct option and one or more incorrect options or distracters. For multiple-choice item models, both the stem and options are required; for constructed-response item models, only the stem is created. Auxiliary information includes any additional content, in either the stem or the options, required to generate an item, and it can be expressed in text, images, tables, diagrams, sound, or video. The stem and options can be further divided into elements. Elements are denoted as strings, which are non-numeric content, and integers, which are numeric content.

Drasgow, Luecht, and Bennett (2006) claimed that item models can be created using either a weak or a strong theory approach. With weak theory, a combination of outcomes from research, theory, and experience provides the guidelines necessary for identifying and manipulating the elements in an item model that yield generated assessment tasks. If the goal is to pre-calibrate the generated items using statistical methods (i.e., step 3 in the three-step process we described in our Introduction), then the models should be designed so …
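To make the item model structure concrete, the following is a minimal Python sketch of steps 1 and 2, assuming a hypothetical ItemModel class with stem, string-element, and integer-element fields and an invented rate-problem task; it is an illustration of the technique, not the generation software used in the studies cited below.

from dataclasses import dataclass
from itertools import product
from typing import Dict, List

@dataclass
class ItemModel:
    # Step 1: the model names the manipulable elements of the task.
    stem: str                               # template text with {placeholders}
    string_elements: Dict[str, List[str]]   # non-numeric content to vary
    integer_elements: Dict[str, List[int]]  # numeric content to vary

def generate_items(model: ItemModel) -> List[dict]:
    # Step 2: systematically vary every combination of element values.
    names = list(model.string_elements) + list(model.integer_elements)
    pools = list(model.string_elements.values()) + list(model.integer_elements.values())
    items = []
    for values in product(*pools):
        bindings = dict(zip(names, values))
        a, b = bindings["a"], bindings["b"]
        key = a * b  # correct option for this illustrative rate problem
        distracters = {key + a, key + b, key - a}  # plausible incorrect options
        items.append({
            "stem": model.stem.format(**bindings),
            "options": sorted({key} | distracters),
            "key": key,
        })
    return items

# Usage: a multiple-choice model for a simple distance = rate x time task.
model = ItemModel(
    stem="A {vehicle} travels {a} km per hour for {b} hours. How many km does it travel?",
    string_elements={"vehicle": ["car", "bus", "train"]},
    integer_elements={"a": [40, 45, 50, 55], "b": [2, 3, 4]},
)
for item in generate_items(model)[:3]:
    print(item["stem"], "->", item["options"], "key =", item["key"])

Enumerating the full cross-product of element values mirrors how a computer-based algorithm varies the elements in an item model; an operational system would add constraints so that only sensible element combinations are rendered as items.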

[1]  Thomas M. Haladyna, et al.  Using Weak and Strong Theory to Create Item Models for Automatic Item Generation: Some Practical Guidelines with Examples, 2012.

[2]  Marvin Minsky, et al.  A framework for representing knowledge, 1974.

[3]  Wim J. van der Linden, et al.  Computerized Adaptive Testing With Item Cloning, 2003.

[4]  Susan E. Embretson, et al.  Automatic Item Generation and Cognitive Psychology, 2006.

[5]  Sandip Sinharay, et al.  Use of Item Models in a Large-Scale Admissions Test: A Case Study, 2008.

[6]  J. H. McMillan  Annual Meeting of the American Educational Research Association, 2001.

[7]  Susan E. Embretson, et al.  Generating items during testing: Psychometric issues and models, 1999.

[8]  Isaac I. Bejar, et al.  A Feasibility Study of On-the-Fly Item Generation in Adaptive Testing, 2002.

[9]  M. Oliveri, et al.  The Learning Sciences in Educational Assessment: The Role of Cognitive Models, 2011, Alberta Journal of Educational Research.

[10]  Mark J. Gierl, et al.  Automatic item generation: theory and practice, 2012.

[11]  Richard M. Luecht  An Introduction to Assessment Engineering for Automatic Item Generation, 2012.

[12]  Mark J. Gierl, et al.  Using automatic item generation to create multiple-choice test items, 2012, Medical education.

[13]  Cornelis A.W. Glas, et al.  Modeling Rule-Based Item Generation, 2011.

[14]  Mark J. Gierl, et al.  Generating Items Under the Assessment Engineering Framework, 2012.

[15]  Chin-Yew Lin, et al.  ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL 2004.

[16]  Ehud Reiter, et al.  NLG vs. Templates, 1995, ArXiv.

[17]  Mark J. Gierl, et al.  The Role of Item Models in Automatic Item Generation, 2012.

[18]  David M. Williamson, et al.  Calibrating Item Families and Summarizing the Results Using Family Expected Response Functions, 2003.

[19]  Randy Elliot Bennett, et al.  Item generation and beyond: Applications of schema theory to mathematics assessment, 2002.

[20]  A. Laduca, et al.  Item modelling procedure for constructing content-equivalent multiple choice questions, 1986, Medical education.

[21]  Isaac I. Bejar  A Generative Analysis of a Three-Dimensional Spatial Task, 1990.

[22]  Mark J. Gierl, et al.  Developing a Taxonomy of Item Model Types to Promote Assessment Engineering, 2008.

[23]  Paul Deane, et al.  Multilingual Generalization of the ModelCreator Software for Math Item Generation, 2005.

[24]  Thomas M. Haladyna, et al.  Item Shells, 1989.

[25]  Wells Hively II, et al.  A "Universe-Defined" System of Arithmetic Achievement Tests, 1968.

[26]  James W Pellegrino, et al.  Technology and Testing, 2009, Science.

[27]  Morris De Beer  Technology and Testing, 2013.

[28]  Isaac I. Bejar  Generative Response Modeling: Leveraging the Computer as a Test Delivery Medium, 1996.