Data mining based fragmentation and prediction of medical data

Data mining concerns the theories, methodologies and, in particular, computer systems used to extract knowledge from large amounts of data. Association rule mining is a general-purpose rule discovery scheme that has been widely used to discover rules in medical applications. Diagnosing disease is a significant and tedious task in medicine. Detecting heart disease from various factors or symptoms is prone to false presumptions, which are often accompanied by unpredictable effects. Using the knowledge and experience of numerous specialists, together with the clinical screening data of patients collected in databases, to support the diagnosis process is therefore considered a valuable option. In this paper, we present an efficient approach for predicting heart attack risk levels from a heart disease database. First, the heart disease database is clustered using the K-means clustering algorithm, which extracts the data relevant to heart attack from the database. This approach allows the number of fragments to be controlled through the parameter k. Next, frequent patterns are mined from the extracted data relevant to heart disease using MAFIA (Maximal Frequent Itemset Algorithm). The machine learning algorithm is then trained on the selected significant patterns for effective prediction of heart attack. We employ the ID3 algorithm as the training algorithm, which represents the heart attack risk level as a decision tree. The results show that the designed prediction system is capable of predicting heart attack effectively.
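The three stages named in the abstract (K-means fragmentation, maximal frequent itemset mining, and ID3 training) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy records, attribute names (`high_bp`, `smoker`, `high_chol`), and all function names are hypothetical. The itemset step is a brute-force enumeration that returns the same kind of output as MAFIA (maximal frequent itemsets) without using MAFIA's search strategy, and the ID3 step shows only the information-gain criterion used to choose a split, not full tree construction.

```python
from collections import Counter
from itertools import combinations
from math import log2

def kmeans(points, k, iters=20):
    """Lloyd's k-means on numeric tuples; k sets the number of fragments."""
    centroids = list(points[:k])  # assumption: first k points as initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [min(range(k), key=lambda c: sum((a - b) ** 2
                      for a, b in zip(p, centroids[c]))) for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels

def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force stand-in for MAFIA: frequent itemsets with no frequent superset."""
    items = sorted({i for t in transactions for i in t})
    frequent = []
    for size in range(1, len(items) + 1):
        level = [frozenset(c) for c in combinations(items, size)
                 if sum(1 for t in transactions if set(c) <= t) >= min_support]
        if not level:  # no frequent set of this size, so none of a larger size
            break
        frequent.extend(level)
    return [s for s in frequent if not any(s < other for other in frequent)]

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """ID3 split criterion: entropy reduction from splitting on attr."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [l for r, l in zip(rows, labels) if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy data: (blood pressure, cholesterol) readings.
points = [(120, 180), (125, 190), (160, 280), (165, 290)]
cluster_labels = kmeans(points, k=2)

# Hypothetical symptom sets, one per patient record.
transactions = [{"high_bp", "smoker", "high_chol"}, {"high_bp", "smoker"},
                {"high_bp", "high_chol"}, {"smoker"}]
patterns = maximal_frequent_itemsets(transactions, min_support=2)

# Hypothetical training rows with a risk level per row.
rows = [{"high_bp": True, "smoker": True}, {"high_bp": True, "smoker": False},
        {"high_bp": False, "smoker": True}, {"high_bp": False, "smoker": False}]
risk = ["high", "high", "low", "low"]
best_attr = max(rows[0], key=lambda a: information_gain(rows, risk, a))
```

In this toy setting `best_attr` is the attribute ID3 would place at the root of the decision tree. A full ID3 implementation would recurse on each branch until the risk labels in a branch are pure or no attributes remain.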
