An Empirical Study about Type2 Diabetics using Duo mining Approach

Advances in data mining and bioinformatics have made data mining techniques valuable for evaluating and analyzing biomedical data. In this paper we propose a framework, called the duo-mining tool, for an intelligent text mining system for diabetic patients based on their medical test reports. Diabetes is a chronic disease and a major cause of morbidity and mortality in developing countries. The International Diabetes Federation estimates that 285 million people around the world have diabetes, and this total is expected to rise to 438 million within 20 years. Type 2 diabetes mellitus (T2DM) is the most common type of diabetes and accounts for 90-95% of all cases. Detecting T2DM from various factors or symptoms has been prone to false presumptions and unpredictable effects. In this context, data mining and machine learning can serve as an alternative means of knowledge discovery from data. We applied several learning methods, such as k-nearest neighbor, decision trees, and support vector machines, to acquire information from historical patient data collected from medical practice centers in and around Guntur. Rules are extracted from the decision tree to offer clinicians decision-making support through early detection of T2DM. In this paper, we also examine how the knowledge extracted by text mining can be integrated with expert system knowledge to assist crucial decision-making processes.
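To illustrate the idea of extracting rules from a decision tree, the following is a minimal sketch and not the implementation used in the study. It fits a one-level decision tree (a decision stump) by minimizing weighted Gini impurity and prints the resulting IF-THEN rules. The feature names (`fasting_glucose_mg_dl`, `bmi`), the records, and the helper functions are all hypothetical, and the learned threshold is an artifact of the toy data, not a clinical cut-off.

```python
from typing import List, Tuple

Record = Tuple[Tuple[float, ...], int]

# Hypothetical toy records: (fasting_glucose_mg_dl, bmi), label 1 = T2DM, 0 = no T2DM.
# These values are invented for illustration and are NOT data from the study.
FEATURES = ["fasting_glucose_mg_dl", "bmi"]
RECORDS: List[Record] = [
    ((92.0, 22.5), 0),
    ((101.0, 27.0), 0),
    ((88.0, 31.0), 0),
    ((110.0, 24.0), 0),
    ((135.0, 29.5), 1),
    ((150.0, 33.0), 1),
    ((128.0, 26.0), 1),
    ((170.0, 35.5), 1),
]


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(records: List[Record]) -> Tuple[int, float, float]:
    """Return (feature index, threshold, weighted Gini) of the best single split."""
    best = None
    n = len(records)
    for f in range(len(records[0][0])):
        values = sorted({x[f] for x, _ in records})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for x, y in records if x[f] <= t]
            right = [y for x, y in records if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def majority(labels: List[int]) -> int:
    """Majority class of binary labels (ties resolve to class 1)."""
    return int(2 * sum(labels) >= len(labels))


def extract_rules(records: List[Record]) -> List[str]:
    """Turn the one-level tree into human-readable IF-THEN rules."""
    f, t, _ = best_split(records)
    left = [y for x, y in records if x[f] <= t]
    right = [y for x, y in records if x[f] > t]
    return [
        f"IF {FEATURES[f]} <= {t:g} THEN class={majority(left)}",
        f"IF {FEATURES[f]} > {t:g} THEN class={majority(right)}",
    ]


if __name__ == "__main__":
    for rule in extract_rules(RECORDS):
        print(rule)
```

A full decision tree applies the same split search recursively to each branch, and every root-to-leaf path then yields one rule. In practice a library implementation would normally replace this hand-written stump.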
