Torture Tests: A Quantitative Analysis for the Robustness of Knowledge-Based Systems

The overall aim of this paper is to provide a general setting for quantitative quality measures of Knowledge-Based System behavior that is widely applicable across many Knowledge-Based Systems. We propose a general approach that we call "degradation studies": an analysis of how system output degrades as a function of degrading system input, such as incomplete or incorrect inputs. Such degradation studies avoid a number of problems that have plagued earlier attempts at defining such quality measures, because they do not require a comparison between different (and often incomparable) systems, and they are entirely independent of the internal workings of the particular Knowledge-Based System at hand. To show the feasibility of our approach, we have applied it in a specific case study: we have taken a large and realistic vegetation-classification system and analyzed its behavior under various forms of missing input. This case study shows that degradation studies can reveal interesting and surprising properties of the system under study.
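The methodology described above can be sketched in code. The following is a minimal illustration, not the paper's actual experimental setup: `classify` stands in for an arbitrary Knowledge-Based System (here a hypothetical toy majority-vote classifier, not the vegetation-classification system of the case study), and the degradation study blanks out increasing fractions of the input and measures how often the output still agrees with the full-input answer.

```python
import random

def classify(features):
    """Hypothetical stand-in for a knowledge-based system.

    Labels a sample by majority vote over its known (non-missing)
    binary feature values; any real KBS could be substituted here,
    since the degradation study treats it as a black box.
    """
    known = [v for v in features.values() if v is not None]
    if not known:
        return "unknown"
    return "class-A" if sum(known) * 2 >= len(known) else "class-B"

def degrade(features, fraction, rng):
    """Simulate incomplete input: blank out a fraction of the features."""
    keys = list(features)
    missing = set(rng.sample(keys, int(fraction * len(keys))))
    return {k: (None if k in missing else v) for k, v in features.items()}

def degradation_curve(samples, fractions, trials=50, seed=0):
    """For each degradation level, measure agreement with the
    full-input output, averaged over samples and random trials."""
    rng = random.Random(seed)
    curve = {}
    for f in fractions:
        hits = total = 0
        for features in samples:
            baseline = classify(features)  # output on undegraded input
            for _ in range(trials):
                total += 1
                if classify(degrade(features, f, rng)) == baseline:
                    hits += 1
        curve[f] = hits / total
    return curve

# Synthetic test data: 20 samples with 10 binary features each.
rng = random.Random(42)
samples = [{f"f{i}": rng.randint(0, 1) for i in range(10)} for _ in range(20)]
curve = degradation_curve(samples, [0.0, 0.25, 0.5, 0.75])
```

The resulting `curve` maps each missing-input fraction to an agreement score in [0, 1]; plotting it shows how gracefully (or abruptly) output quality falls off, which is exactly the kind of behavior the paper's degradation studies aim to expose, independently of the system's internals.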

[1]  Alun D. Preece,et al.  Evaluation of verification tools for knowledge-based systems , 1997, Int. J. Hum. Comput. Stud..

[2]  Standard Glossary of Software Engineering Terminology , 1990 .

[3]  Frank van Harmelen,et al.  Editorial: Evaluating knowledge engineering techniques , 1999, Int. J. Hum. Comput. Stud..

[4]  Johann Schumann,et al.  NORA/HAMMR: making deduction-based software component retrieval practical , 1997, Proceedings 12th IEEE International Conference Automated Software Engineering.

[5]  Frank van Harmelen,et al.  Computing Approximate Diagnoses By Using Approximate Entailment , 1996, KR.

[6]  Swaminathan Natarajan Imprecise and Approximate Computation , 1995 .

[7]  Shlomo Zilberstein,et al.  Approximate Reasoning Using Anytime Algorithms , 1995 .

[8]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[9]  Frank van Harmelen,et al.  Exploiting Domain Knowledge for Approximate Diagnosis , 1997, IJCAI.

[10]  Nigel Shadbolt,et al.  The experimental evaluation of knowledge acquisition techniques and methods: history, problems and new directions , 1999, Int. J. Hum. Comput. Stud..

[11]  Frederick Hayes-Roth,et al.  Knowledge-based expert systems — The State of the Art in the US , 1984, The Knowledge Engineering Review.

[12]  S.J.J. Smith,et al.  Empirical Methods for Artificial Intelligence , 1995 .

[13]  Mark S. Boddy,et al.  An Analysis of Time-Dependent Planning , 1988, AAAI.

[14]  Frank van Harmelen,et al.  Characterising approximate problem solving: by partially fulfilled pre- and postconditions , 1998, EUROVAV.

[15]  Shlomo Zilberstein,et al.  Using Anytime Algorithms in Intelligent Systems , 1996, AI Mag..

[16]  Frank Puppe,et al.  Wiederverwendbare Bausteine für eine konfigurierbare Diagnostik-Shell , 1994, Künstliche Intell..