Detecting major disease in public hospital using ensemble techniques

Hepatitis is a chronic disease that has become a major problem in developing countries. Health experts estimate that more than 185 million people have chronic hepatitis worldwide. This paper attempts to detect a major disease, hepatitis, in a public hospital using ensemble methods. Several ensemble techniques were applied to acquire knowledge from patient medical records. Rules extracted from a decision tree and a neural network are then summarized to assist experts in detecting hepatitis. The accuracy of the algorithms was also evaluated, and the experimental results show that Bagging, with a decision tree as the base classifier, performs best among the classifiers tested.
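To make the Bagging idea concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It assumes a depth-1 decision tree (a stump) as the base classifier, and the records, feature values, labels, and function names are all hypothetical; the paper's actual dataset and tooling are not reproduced here. Each base classifier is trained on a bootstrap sample drawn with replacement, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Fit a depth-1 decision tree: the (feature, threshold) split with the fewest errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Each side predicts its majority class; left is never empty because
            # t is taken from the data, right can be empty at the largest value.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def bagging_fit(rows, labels, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Combine the base classifiers by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy records: two numeric features per patient, label 1 = positive.
rows = [[1.0, 3.2], [2.0, 1.1], [3.0, 4.5], [4.0, 2.7],
        [6.0, 3.9], [7.0, 1.8], [8.0, 4.1], [9.0, 2.2]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

models = bagging_fit(rows, labels)
print(bagging_predict(models, [0.5, 3.0]), bagging_predict(models, [9.5, 3.0]))
```

Because every stump sees a slightly different resample, the individual thresholds vary, and the vote averages out that variance. A deeper tree learner could be substituted for fit_stump without changing the bagging functions.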
