Evaluating the quality of medical multiple‐choice items created with automated processes

Computerised assessment raises formidable challenges because it requires large numbers of test items. Automatic item generation (AIG) can help address this test development problem because it yields large numbers of new items both quickly and efficiently. To date, however, the quality of the items produced using a generative approach has not been evaluated. The purpose of this study was to determine whether automatic processes yield items that meet quality standards appropriate for medical testing. Quality was evaluated in two ways: first, by having a four‐member expert medical panel rate items created using both AIG and traditional processes against indicators of multiple‐choice item quality, and second, by asking the panellists to identify, in a blind review, which items had been developed using AIG.