Automated extraction of hierarchical decision rules from clinical databases using rough set model

Abstract One of the most important problems with rule induction methods is that they cannot extract rules that plausibly represent experts' decision processes. On the one hand, rule induction methods induce probabilistic rules whose description length is too short compared with experts' rules. On the other hand, construction of Bayesian networks generates rules that are too lengthy. In this paper, the characteristics of experts' rules are closely examined and a new approach to extracting plausible rules is introduced, which consists of the following three procedures. First, the characterization of decision attributes (given classes) is extracted from databases, and the classes are classified into several groups with respect to the characterization. Then, two kinds of sub-rules are induced: characterization rules for each group and discrimination rules for each class in the group. Finally, those two parts are integrated into one rule for each decision attribute. The proposed method was evaluated on a medical database, and the experimental results show that the induced rules correctly represent experts' decision processes.
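
The three procedures can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract gives no formal definitions, so the sketch assumes that a class is characterized by the attribute-value pairs present in all of its examples, that classes are grouped greedily whenever they share at least one such pair, and that a discrimination rule consists of the pairs seen in one class but in no other class of its group. The toy records, attribute names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records; attribute names and values are illustrative only.
RECORDS = [
    {"nature": "throbbing", "location": "lateral", "nausea": "yes", "cls": "migraine"},
    {"nature": "throbbing", "location": "whole", "nausea": "yes", "cls": "migraine"},
    {"nature": "throbbing", "location": "lateral", "nausea": "no", "cls": "cluster"},
    {"nature": "persistent", "location": "whole", "nausea": "no", "cls": "tension"},
    {"nature": "persistent", "location": "lateral", "nausea": "no", "cls": "tension"},
]


def pairs(record):
    """Return the attribute-value pairs of a record, excluding the class label."""
    return {(a, v) for a, v in record.items() if a != "cls"}


def characterize(records):
    """Step 1a (assumed definition): pairs present in every example of a class."""
    by_class = defaultdict(list)
    for r in records:
        by_class[r["cls"]].append(pairs(r))
    return {c: frozenset(set.intersection(*ps)) for c, ps in by_class.items()}


def group_classes(char):
    """Step 1b (assumed heuristic): greedily merge classes that share a pair.

    Each group is [members, common], where common is the set of
    characterizing pairs shared by all members.
    """
    groups = []
    for c in sorted(char):  # sorted for a deterministic result
        for g in groups:
            shared = g[1] & char[c]
            if shared:
                g[0].append(c)
                g[1] = shared
                break
        else:
            groups.append([[c], char[c]])
    return groups


def induce_rules(records):
    """Steps 2 and 3: build both sub-rules and integrate them per class."""
    char = characterize(records)
    seen = defaultdict(set)  # every pair observed in each class
    for r in records:
        seen[r["cls"]] |= pairs(r)
    rules = {}
    for members, common in group_classes(char):
        for c in members:
            others = set().union(*(seen[o] for o in members if o != c))
            # A singleton group has nothing to discriminate against.
            disc = (seen[c] - others) if len(members) > 1 else set()
            rules[c] = {
                "characterization": frozenset(common),  # group-level sub-rule
                "discrimination": frozenset(disc),      # class-level sub-rule
            }
    return rules


if __name__ == "__main__":
    for cls, rule in sorted(induce_rules(RECORDS).items()):
        print(cls, sorted(rule["characterization"]), sorted(rule["discrimination"]))
```

On this toy data, the migraine and cluster classes end up in one group because both are characterized by a throbbing nature, and each integrated rule pairs that shared condition with the pairs that separate the class from the rest of its group. A realistic implementation would replace the exact-coverage and greedy-grouping assumptions with whatever thresholds and similarity measure the full paper defines.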
