System for managing and refining structural characteristics discovered from databases

Systems that automatically discover knowledge from databases will play an increasingly important role in building and sharing large-scale knowledge bases. Although many systems for knowledge discovery in databases have been proposed, few address the management and refinement of the discovered knowledge. This matters because the contents of most databases change continually, and erroneous data can be a significant problem in real-world databases. Discovering knowledge from databases is therefore an iterative process of generating and evaluating tentative hypotheses, then refining and managing them. This paper describes a system named IIBR (Inheritance Inference Based Refinement) for managing and refining structural characteristics discovered from databases. Structural characteristics are an important kind of regularity hidden in databases; they are represented by regression models that describe three kinds of functional relations: exact, strong, and weak. IIBR is a subsystem of the authors' GLS (Global Learning Scheme) discovery system and can be used cooperatively with other GLS subsystems such as KOSI (Knowledge Oriented Statistic Inference). With IIBR, the structural characteristics discovered by KOSI can be added to a knowledge base as deductive rules together with the data sets that show their errors, and can be easily managed and refined as the data in the database change. IIBR is based on inheritance inference and error analysis, as well as on the model representation of knowledge, multiple worlds/levels, and metareasoning in the knowledge-based system KAUS. Experience with a prototype of IIBR implemented in KAUS is discussed.
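As a rough illustration of the idea of storing a regression model together with its errors and revisiting it when the data change, the following is a minimal Python sketch. It is not the IIBR or KOSI implementation, which is built on KAUS. The names `Rule`, `fit_rule`, and `refine`, the use of simple least-squares regression, and the R² thresholds separating exact, strong, and weak relations are all assumptions made for this example; the abstract does not specify how the three kinds are distinguished.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical thresholds on R^2. The source does not say how exact,
# strong, and weak relations are separated; these values are assumptions.
EXACT_R2 = 0.999
STRONG_R2 = 0.8


@dataclass
class Rule:
    """A fitted relation y = slope * x + intercept plus its error data."""
    slope: float
    intercept: float
    r2: float
    kind: str          # "exact", "strong", or "weak"
    residuals: list    # per-record errors kept alongside the rule


def fit_rule(xs, ys):
    """Fit a simple least-squares line and label the relation by R^2."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are constant; no regression line exists")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2 = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    if r2 >= EXACT_R2:
        kind = "exact"
    elif r2 >= STRONG_R2:
        kind = "strong"
    else:
        kind = "weak"
    return Rule(slope, intercept, r2, kind, residuals)


def refine(old_rule, xs, ys):
    """Refit on the updated data; report whether the relation kind changed."""
    new_rule = fit_rule(xs, ys)
    return new_rule, new_rule.kind != old_rule.kind


# Usage: a relation that starts exact and weakens after noisy records arrive.
xs = [1, 2, 3, 4, 5]
rule = fit_rule(xs, [3, 5, 7, 9, 11])                # y = 2x + 1 exactly
updated, changed = refine(rule, xs, [2.1, 3.9, 6.2, 7.8, 10.1])
print(rule.kind, updated.kind, changed)
```

Keeping the residuals with each rule mirrors the abstract's description of storing "sets of data for showing their errors"; a fuller system would presumably refit only when those errors grow, rather than on every change.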
