Extraction of Significant Patterns from Heart Disease Warehouses for Heart Attack Prediction

Summary: The diagnosis of diseases is a significant and tedious task in medicine. Detecting heart disease from various factors or symptoms is a multi-layered problem, prone to false presumptions and often accompanied by unpredictable effects. Drawing on the knowledge and experience of numerous specialists, together with the clinical screening data of patients collected in databases, is therefore considered a valuable way to facilitate the diagnosis process. The healthcare industry gathers enormous amounts of heart disease data that, regrettably, are not “mined” to uncover hidden information for effective decision making by healthcare practitioners. In this paper, we propose an efficient approach for extracting significant patterns from heart disease warehouses for heart attack prediction. Initially, the data warehouse is preprocessed to make it suitable for the mining process. After preprocessing, the heart disease warehouse is clustered using the K-means clustering algorithm, which extracts the data relevant to heart attack from the warehouse. Subsequently, the frequent patterns are mined from the extracted data using the MAFIA algorithm. Then the significant weightage of each frequent pattern is calculated. Finally, the patterns significant to heart attack prediction are chosen based on the calculated significant weightage. These significant patterns can be used in the development of a heart attack prediction system.
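The pipeline described in the summary (cluster, mine frequent patterns, weight, select) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration only: the toy patient records, the rule for picking the heart-attack-relevant cluster, the per-item weights, the weighting formula (support multiplied by the sum of item weights), and the threshold are all hypothetical, since the summary does not define them. The mining step is a brute-force search for maximal frequent itemsets, which is the kind of output MAFIA produces, but it is not the MAFIA algorithm itself.

```python
from itertools import combinations

def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means; centers start at the first k points (deterministic)."""
    centers = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force stand-in for MAFIA: frequent itemsets with no frequent superset."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        level = [frozenset(c) for c in combinations(items, size)
                 if sum(1 for t in transactions if set(c) <= t) >= min_support]
        if not level:  # no frequent set of this size, so none of any larger size
            break
        frequent.extend(level)
    return [s for s in frequent if not any(s < other for other in frequent)]

# Toy records (made up): ((age, systolic blood pressure), observed risk items).
records = [
    ((30, 110), {"smoker"}),
    ((62, 155), {"chest_pain", "high_bp"}),
    ((60, 150), {"chest_pain", "high_bp"}),
    ((64, 160), {"smoker", "high_cholesterol"}),
    ((59, 152), {"smoker", "high_cholesterol", "chest_pain"}),
    ((33, 115), {"high_cholesterol"}),
    ((28, 108), set()),
]

# Step 1: cluster on the numeric features.
labels, centers = kmeans([features for features, _ in records], k=2)

# Assumed selection rule: the cluster with the higher mean blood pressure
# is treated as the one relevant to heart attack.
risk_cluster = max(range(2), key=lambda c: centers[c][1])
relevant = [items for (_, items), l in zip(records, labels) if l == risk_cluster]

# Step 2: mine maximal frequent patterns from the extracted records.
patterns = maximal_frequent_itemsets(relevant, min_support=2)

# Step 3: hypothetical weighting, support times the sum of assumed item weights.
# These weights are illustrative placeholders, not clinical values.
item_weight = {"chest_pain": 0.9, "high_bp": 0.8,
               "smoker": 0.6, "high_cholesterol": 0.5}

def significance(pattern):
    support = sum(1 for t in relevant if pattern <= t)
    return support * sum(item_weight[i] for i in pattern)

# Step 4: keep patterns whose weight reaches an assumed threshold.
significant = [p for p in patterns if significance(p) >= 3.0]
print(sorted(sorted(p) for p in significant))
```

With a real warehouse, the toy records, weights, and threshold would be replaced by the preprocessed data and by whatever weighting scheme the full paper defines; the brute-force miner is only practical for a handful of items.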
