Validating Metrics for a Mastoidectomy Simulator

One of the primary barriers to the acceptance of surgical simulators is that most still require a significant amount of an instructing surgeon's time to evaluate students and give them feedback. An important area of research in this field is therefore the development of metrics that enable a simulator to serve as an essentially self-contained teaching tool, capable of identifying and explaining the user's weaknesses. These metrics must be validated, however, to ensure that the evaluations provided by the "virtual instructor" match those that a real instructor would provide if present. We have previously proposed a number of algorithms for providing automated feedback in the context of a mastoidectomy simulator. In this paper, we present the results of a user study in which we attempted to establish construct validity (with inter-rater reliability) for the simulator itself and to validate our metrics. Fifteen subjects (8 experts, 7 novices) were each asked to perform two virtual mastoidectomies. Each virtual procedure was recorded, and two experienced instructing surgeons assigned global scores, which were then correlated with the subjects' experience levels. We then validated our metrics by correlating the scores generated by our algorithms with the instructors' global ratings, as well as with metric-specific sub-scores assigned by one of the instructors.
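The metric-validation step described above, correlating algorithm-generated scores with instructor ratings, can be illustrated with a minimal sketch. The abstract does not state which correlation statistic was used, so the choice of Spearman rank correlation here is an assumption, as are the function names and the input values, which are placeholders and not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values for illustration only (NOT results from the study):
# one algorithm-generated score and one instructor rating per recorded procedure.
algorithm_scores = [0.2, 0.5, 0.4, 0.9, 0.7]
instructor_ratings = [1, 3, 2, 5, 4]

rho = spearman(algorithm_scores, instructor_ratings)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based statistic is a natural fit when instructor ratings are ordinal, but a different coefficient could be substituted without changing the overall structure.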