Models for measuring performance of medical expert systems

Abstract

Scoring schemes for measuring expert system performance are reviewed. Rule-based classification systems and their error rates on sample data are considered. We present several models of measurement that are categorized by four characteristics: mutual exclusivity of classes, unique answers provided by the system, known correct conclusions for each case, and use of confidence factors to weight the system's conclusions. An underlying model of performance measurement is critical in determining which scoring strategy is appropriate for a system and whether a comparison of different medical expert systems can be made.
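To make the four characteristics concrete, the following is a minimal sketch of two scoring schemes, not the paper's own formulas. It assumes the simplest setting the abstract names: mutually exclusive classes and a known correct conclusion for every case. The function names `error_rate` and `confidence_weighted_score`, and the choice to score a case by the confidence the system places on the correct class, are illustrative assumptions.

```python
from typing import Mapping, Sequence


def error_rate(predicted: Sequence[str], correct: Sequence[str]) -> float:
    """Fraction of cases misclassified.

    Assumes mutually exclusive classes, exactly one answer per case,
    and a known correct conclusion for each case.
    """
    if len(predicted) != len(correct) or not correct:
        raise ValueError("need two non-empty sequences of equal length")
    wrong = sum(p != c for p, c in zip(predicted, correct))
    return wrong / len(correct)


def confidence_weighted_score(
    conclusions: Sequence[Mapping[str, float]], correct: Sequence[str]
) -> float:
    """Mean confidence the system assigns to the correct class.

    Hypothetical scheme: each case maps class name to a confidence
    factor in [0, 1]; a class the system did not conclude counts as 0.
    """
    if len(conclusions) != len(correct) or not correct:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(case.get(c, 0.0) for case, c in zip(conclusions, correct))
    return total / len(correct)


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    print(error_rate(["a", "b", "a", "c"], ["a", "b", "b", "c"]))
    print(confidence_weighted_score([{"a": 0.8, "b": 0.2}, {"b": 0.5}], ["a", "a"]))
```

The two functions can rank the same system differently, which is one reason the choice of underlying measurement model matters when comparing systems.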
