Assessing problem solving in expert systems using human benchmarking

Abstract: The human benchmarking approach assesses problem solving in expert systems by measuring their performance against a range of human problem-solving performances. We established a correspondence between the functions of the expert system GATES and the human problem-solving skills required to perform a scheduling task. We then developed process and outcome measures and administered them to groups of presumed differing problem-solving ability. The problem-solving ability, or "intelligence," of this expert system is extremely high in the narrow domain of scheduling planes to airport gates, as indicated by its superior performance compared with that of undergraduates, graduate students, and expert human schedulers (i.e., air traffic controllers). Overall, the study supports the feasibility of using the human benchmarking methodology to evaluate the problem-solving ability of a specific expert system.
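To make the comparison concrete, here is a minimal sketch of the human-benchmarking logic in Python: the system's outcome score on the shared scheduling task is expressed relative to the distribution of human scores across the three comparison groups. The scoring metric, the group names, and all numeric values below are hypothetical illustrations, not data from the GATES study.

```python
# Minimal sketch of the human-benchmarking comparison described above.
# All scores and the scoring metric are hypothetical illustrations.

from statistics import mean, stdev

def benchmark(system_score: float, human_scores: list[float]) -> float:
    """Express the system's score as a z-score relative to the
    distribution of human problem-solving performances."""
    return (system_score - mean(human_scores)) / stdev(human_scores)

# Hypothetical outcome scores on the same gate-assignment task
# (e.g., fraction of flights assigned to feasible gates).
undergraduates    = [0.52, 0.58, 0.61, 0.55]
graduate_students = [0.63, 0.70, 0.66, 0.72]
expert_schedulers = [0.78, 0.81, 0.75, 0.84]

all_humans = undergraduates + graduate_students + expert_schedulers

# A large positive z-score indicates the system outperforms the
# full range of human benchmark groups on this task.
print(f"system z-score: {benchmark(0.95, all_humans):+.2f}")
```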
