Knowledge Discovery in Databases: Exploiting Knowledge-Level Redescription

In this paper, we analyse the nature of knowledge discovery in databases (KDD). We conclude that it is similar to knowledge acquisition, yet unique in that it employs pre-existing data collected for reasons other than analysis. The post-hoc nature of KDD means that the database is often unfit for analysis using traditional machine-learning techniques. We present a methodology for KDD that attempts to overcome this problem. Knowledge elicitation techniques are employed to define the structure of an appropriate learning dataset and to relate this structure to the raw database. The raw database is then redescribed in terms of the new structure before machine-learning tools are applied. We also present CASTLE, a software workbench designed to support this methodology, and illustrate its use with a worked example drawn from the Sisyphus-I room allocation problem.
