Multi-relational Data Mining in Medical Databases

This paper presents the application of a method for mining a multi-relational database that contains information about patients suffering from chronic hepatitis. Our approach can be used on any kind of multi-relational database and aims to extract probabilistic tree patterns from it using grammatical inference techniques. To extract these patterns, we propose representing the database as a set of trees. Trees provide a natural way to represent structured information while taking into account the statistical distribution of the data. In this work we try to show how such patterns can be useful for interpreting knowledge in the medical domain.
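As a rough illustration of the tree representation described above, the following minimal sketch builds one tree per patient from two toy tables and estimates the relative frequency of each non-leaf subtree. It is not the paper's inference algorithm: plain frequency counting stands in for grammatical inference, and the table names, columns, and values (`patients`, `exams`, `GOT`, `GPT`) are hypothetical.

```python
from collections import Counter

# Hypothetical toy tables; names and values are illustrative assumptions,
# not taken from the hepatitis database discussed in the paper.
patients = [
    {"id": 1, "sex": "F"},
    {"id": 2, "sex": "M"},
    {"id": 3, "sex": "F"},
]
exams = [
    {"patient_id": 1, "test": "GOT", "level": "high"},
    {"patient_id": 1, "test": "GPT", "level": "high"},
    {"patient_id": 2, "test": "GOT", "level": "normal"},
    {"patient_id": 3, "test": "GOT", "level": "high"},
]


def leaf(label):
    """A tree is a pair (label, children); a leaf has no children."""
    return (label, ())


def patient_tree(patient, exam_rows):
    """Follow the foreign key patient_id to attach one subtree per exam."""
    exam_nodes = tuple(
        ("exam", (leaf(e["test"]), leaf(e["level"])))
        for e in exam_rows
        if e["patient_id"] == patient["id"]
    )
    return ("patient", (leaf(patient["sex"]),) + exam_nodes)


def subtrees(tree):
    """Yield the tree itself and every subtree below it."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)


def subtree_distribution(trees):
    """Empirical relative frequency of each non-leaf subtree in the sample."""
    counts = Counter(s for t in trees for s in subtrees(t) if s[1])
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


trees = [patient_tree(p, exams) for p in patients]
dist = subtree_distribution(trees)

# List the patterns from most to least frequent.
for pattern, prob in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(round(prob, 3), pattern)
```

In this toy sample the subtree for a high GOT result is the most frequent pattern. A stochastic tree automaton would generalize over such trees rather than merely count them, so the sketch only conveys how relational rows become trees with an attached distribution.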
