Diabetes is a life-threatening condition and a major concern in modern health care. Data mining techniques have been used to identify diabetes risk factors and the co-morbid conditions associated with diabetes. To help curb the progression of diabetes mellitus, this work applies distributed association rule mining and rule summarization techniques to electronic medical records (EMRs). Frequent itemset mining is used to discover sets of risk factors and co-morbid conditions in a distributed medical dataset. In general, association rule mining (ARM) generates a large volume of rules, which must be summarized before they can be applied to medical records. This work presents a novel approach to finding the common factors that lead to a high risk of diabetes and of the co-morbid conditions associated with it. It combines association rule mining and association rule summarization with improved classification algorithms. Existing systems apply association rule mining to EMRs to discover sets of risk factors and the corresponding subpopulations of patients at particularly high risk of developing diabetes. Given the high dimensionality of EMRs, association rule mining generates a very large set of rules, which must be summarized for easy clinical use. The existing system reviewed four association rule set summarization techniques and conducted a comparative evaluation to provide guidance on diabetes risk prediction. In the medical domain, predicting diabetes and its co-morbid conditions at an early stage is important. We propose a set of methods to perform co-morbid condition prediction. The proposed technique, named SAM (Split and Merge), is based on fast distributed quantitative association rule mining and rule filtering for predicting co-morbid conditions associated with diabetes. The SAM algorithm is used to discover frequent itemsets and summarized data sets.
In a performance comparison based on prediction efficiency, the proposed SAM performs better than the existing BUS approach.
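
To make the underlying idea concrete, the following is a minimal sketch of plain (non-distributed, non-quantitative) frequent itemset mining and rule generation. It is not the SAM algorithm, whose details are not given here. The toy records, the item names, and the helper functions `support`, `frequent_itemsets`, and `rules_for` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy records: each set lists the conditions noted for one patient.
records = [
    {"obesity", "hypertension", "diabetes"},
    {"obesity", "hypertension", "diabetes"},
    {"obesity", "diabetes"},
    {"hypertension"},
    {"obesity", "hypertension"},
]

def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= r for r in records) / len(records)

def frequent_itemsets(records, min_support):
    """Brute-force enumeration of itemsets with support >= min_support."""
    items = sorted(set().union(*records))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            s = support(candidate, records)
            if s >= min_support:
                result[candidate] = s
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return result

def rules_for(target, freq, min_confidence):
    """Rules 'antecedent -> target', keeping those that meet min_confidence.

    confidence = support(antecedent plus target) / support(antecedent)
    """
    rules = []
    for itemset, s in freq.items():
        if target in itemset and len(itemset) > 1:
            antecedent = itemset - {target}
            # Every subset of a frequent itemset is frequent, so the lookup is safe.
            conf = s / freq[antecedent]
            if conf >= min_confidence:
                rules.append((antecedent, s, conf))
    return rules

freq = frequent_itemsets(records, min_support=0.3)
for antecedent, s, conf in rules_for("diabetes", freq, min_confidence=0.7):
    print(sorted(antecedent), "-> diabetes", f"support={s:.2f}", f"confidence={conf:.2f}")
```

On real EMR data the number of such rules grows quickly with the number of attributes, which is why a summarization or filtering step is needed before the rules are clinically usable.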