The Delphi Method to Validate Diagnostic Knowledge in Computerized ECG Interpretation

We investigated the applicability of the Delphi method for increasing agreement among multiple cardiologists on, first, their classifications of a set of electrocardiograms and, second, their reasons for these classifications. Five cardiologists were asked to judge the computer classifications of a set of thirty ECGs. A cardiologist who disagreed with a computer classification was required to provide a new classification and a reason for the change. The results of this first round were compiled and fed back anonymously to the cardiologists. In a second round the cardiologists were asked to judge the ECGs once again and to rate the reasons provided in the first round. The level of agreement was estimated by means of the kappa statistic. The Delphi procedure substantially increased agreement among the cardiologists on the classifications. The final agreement was very high and comparable to the intraobserver agreement. There was also a high level of agreement on the reasons provided by the cardiologists. However, the use of these reasons to improve the program's performance is hampered by the qualitative nature of many of them. Suggestions are given for a more formalized elicitation of knowledge.
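The abstract does not specify which variant of the kappa statistic was used; with five observers, either averaged pairwise kappas or a multi-rater generalization such as Fleiss' kappa would be natural choices. As a sketch, the standard two-observer (Cohen) form is

\kappa = \frac{p_o - p_e}{1 - p_e},

where p_o is the observed proportion of agreement between two observers and p_e is the proportion of agreement expected by chance alone; \kappa = 1 indicates perfect agreement and \kappa = 0 indicates agreement no better than chance.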