Identifying top-k Vital Patterns from multi-class medical data

With the development of modern science, medical research is no longer limited to studying a single type of disease; it increasingly aims to distinguish the subtypes of that disease more precisely. For example, breast cancer can be divided into three subtypes: BRCA1, BRCA2 and Sporadic. Previous work has focused only on distinguishing between several pairs of tumor types. Distinguishing multiple disease types simultaneously, which is important for medical researchers, has not yet been well studied. In this paper, we define the VP (an acronym for “Vital Pattern”) and the PP (an acronym for “Protect Pattern”) using a statistical metric, and propose a new algorithm that makes use of the property of this metric to discover VPs and PPs from multiple disease types. The algorithm generates rules that are useful to medical researchers. The results demonstrate the feasibility of clinically useful classification of patients with multiple types of pneumonia.
