Design Aspects of Discovery Systems

SUMMARY This paper reviews design aspects of computational discovery systems by analyzing several successful systems. We first review the concept of a viewscope/view on data, which provides an interpretation of raw data in a specific domain. We then relate this concept to the KDD process described by Fayyad et al. (1996) and to the developer's role in computational discovery discussed by Langley (1998). We emphasize that integrating human experts with discovery systems is a crucial problem in designing such systems, and we argue, supported by our analysis of existing systems, that the viewscope/view concept offers a way to approach this problem.

[1] Ming Li, et al. An Introduction to Kolmogorov Complexity and Its Applications, 2019, Texts in Computer Science.

[2] Raúl E. Valdés-Pérez, et al. Automatic componential analysis of kinship semantics with a proposed structural solution to the problem of multiple models, 1998.

[3] Pat Langley, et al. The Computer-Aided Discovery of Scientific Knowledge, 1998, Discovery Science.

[4] Ayumi Shinohara, et al. Knowledge Acquisition from Amino Acid Sequences by Machine Learning System BONSAI, 1992.

[5] G. Klopman. Artificial intelligence approach to structure-activity studies. Computer automated structure evaluation of biological activity of organic molecules, 1985.

[6] R. Paro, et al. The Polycomb protein shares a homologous domain with a heterochromatin-associated protein of Drosophila, 1991, Proceedings of the National Academy of Sciences of the United States of America.

[7] A. Patapoutian, et al. Identification and purification of a factor that binds to the Mlu I cell cycle box of yeast DNA replication genes, 1991, Proceedings of the National Academy of Sciences of the United States of America.

[8] Inge Jonassen, et al. Efficient discovery of conserved patterns using a pattern graph, 1997, Comput. Appl. Biosci.

[9] Padhraic Smyth, et al. From Data Mining to Knowledge Discovery in Databases, 1996, AI Mag.

[10] David Haussler, et al. Mining scientific data, 1996, CACM.

[11] Rüdiger Wirth, et al. Towards Process-Oriented Tool Support for Knowledge Discovery in Databases, 1997, PKDD.

[12] Yukio Ohsawa, et al. KeyGraph: automatic indexing by co-occurrence graph based on building construction metaphor, 1998, Proceedings IEEE International Forum on Research and Technology Advances in Digital Libraries -ADL'98-.

[13] Peter C. Cheeseman, et al. Bayesian Classification (AutoClass): Theory and Results, 1996, Advances in Knowledge Discovery and Data Mining.

[14] Raúl E. Valdés-Pérez, et al. Principles of Human Computer Collaboration for Knowledge Discovery in Science, 1999, Artif. Intell.

[15] James Kelly, et al. AutoClass: A Bayesian Classification System, 1993, ML.

[16] Yukio Ohsawa, et al. Discover Risky Active Faults by Indexing an Earthquake Sequence, 1999, Discovery Science.

[17] Neil R. Smalheiser, et al. An interactive system for finding complementary literatures: a stimulus to scientific discovery, 1995, Artificial Intelligence.

[18] G. Klopman. MULTICASE 1. A Hierarchical Computer Automated Structure Evaluation Program, 1992.

[19] A. Stewart, et al. The chromo shadow domain, a second chromo domain in heterochromatin-binding protein 1, HP1, 1995, Nucleic acids research.

[20] Esko Ukkonen, et al. Discovering Unbounded Unions of Regular Pattern Languages from Positive Examples (Extended Abstract), 1996, ISAAC.

[21] Dana Angluin, et al. Finding Patterns Common to a Set of Strings, 1980, J. Comput. Syst. Sci.

[22] Ronald W. Davis, et al. A genome-wide transcriptional analysis of the mitotic cell cycle, 1998, Molecular cell.

[23] Esko Ukkonen, et al. Pattern Discovery in Biosequences, 1998, ICGI.

[24] P. Brown, et al. Exploring the metabolic and genetic control of gene expression on a genomic scale, 1997, Science.

[25] Satoru Miyano, et al. Designing Views in HypothesisCreator: System for Assisting in Discovery, 1999, Discovery Science.

[26] Udi Manber, et al. Fast text searching: allowing errors, 1992, CACM.

[27] Raúl E. Valdés-Pérez, et al. Machine Discovery in Chemistry: New Results, 1995.

[28] Yusuke Nakamura, et al. Mutation analysis in the BRCA2 gene in primary breast cancers, 1996, Nature Genetics.

[29] B. Barrell, et al. Life with 6000 Genes, 1996, Science.

[30] J. Rissanen, et al. Modeling By Shortest Data Description, 1978, Autom.

[31] Siemion Fajtlowicz, et al. On conjectures of Graffiti, 1988, Discret. Math.

[32] Satoru Miyano, et al. Toward Genomic Hypothesis Creator: View Designer for Discovery, 1998, Discovery Science.

[33] L. Johnston, et al. Coordination of expression of DNA synthesis genes in budding yeast by a cell-cycle regulated trans factor, 1991, Nature.