Views: Fundamental Building Blocks in the Process of Knowledge Discovery

We present a novel approach to describing the knowledge discovery process, focusing on a generalized form of attribute called a view. We observe that the process of knowledge discovery can, in fact, be modeled as the design, generation, use, and evaluation of views, and we assert that views are the fundamental building blocks of the discovery process. We realize these concepts as an object-oriented class library and conduct computational knowledge discovery experiments on biological data, namely the characterization of N-terminal protein sorting signals, yielding significant results.
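To make the notion of a view as a generalized attribute more concrete, the following is a minimal sketch in Python, not the class library described above. It assumes a view can be treated as a named function from a data object to a value, and that new views can be designed by composing existing ones. The names `View`, `then`, `prefix`, `fraction_in`, `at_least`, and `accuracy` are hypothetical, and the sequences and labels are made-up toy data used only to exercise the code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass(frozen=True)
class View:
    """A generalized attribute: a named function from a data object to a value."""
    name: str
    fn: Callable[[Any], Any]

    def __call__(self, obj: Any) -> Any:
        return self.fn(obj)

    def then(self, other: "View") -> "View":
        """Design a new view by feeding this view's value into another view."""
        return View(f"{other.name}({self.name})", lambda obj: other(self(obj)))


def prefix(n: int) -> View:
    """View that keeps the first n symbols of a sequence (its N-terminal end)."""
    return View(f"prefix{n}", lambda seq: seq[:n])


def fraction_in(symbols: str) -> View:
    """View giving the fraction of a sequence's symbols that belong to `symbols`."""
    return View(
        f"frac[{symbols}]",
        lambda seq: sum(c in symbols for c in seq) / len(seq) if seq else 0.0,
    )


def at_least(threshold: float) -> View:
    """View turning a number into a Boolean value."""
    return View(f">={threshold}", lambda x: x >= threshold)


def accuracy(view: View, examples: Sequence[Tuple[str, bool]]) -> float:
    """Evaluate a Boolean-valued view against labeled examples."""
    return sum(view(seq) == label for seq, label in examples) / len(examples)


# Made-up strings and labels, for illustration only (not real protein data).
toy = [
    ("RRKKAAAA", True),
    ("KRRAGGGG", True),
    ("AAAAKRRK", False),
    ("GGGGGGGG", False),
]

# Design: compose three simple views into one candidate view.
candidate = prefix(4).then(fraction_in("KR")).then(at_least(0.5))

# Generation and use: apply the view to each object; evaluation: score it.
values = [candidate(seq) for seq, _ in toy]
score = accuracy(candidate, toy)
```

In this sketch, design corresponds to composing views with `then`, generation and use to applying the composed view to each object, and evaluation to scoring the resulting values against labels. Any real system would need a richer view vocabulary and a more careful evaluation measure than plain accuracy.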
