SVM-based Decision Tree for medical knowledge representation

Machine learning has become one of the most active research topics in recent years. Many applications integrate techniques such as Chi-squared Automatic Interaction Detection (CHAID), decision trees, k-Nearest Neighbors (KNN), Recursive Partitioning and Regression Trees, and Support Vector Machines (SVM) into platforms in domains including healthcare, economics, and agriculture. Researchers in the healthcare domain have built effective systems that reduce clinicians' diagnostic effort. However, some models lack the flexibility to interpret knowledge the way a clinician's judgment would. To overcome this problem, SVM, a supervised learning algorithm used here with a radial basis function (RBF) kernel as a nonlinear classification model, was exploited to classify medical data and extract knowledge from it. The idea behind the proposed system was to classify the given data stage by stage with SVM. Incorrectly classified patterns were fed to the succeeding stage to find a better split point in SVM. The split point was then used to calculate information gain, which identifies the principal features among the candidate attributes. Finally, knowledge-based decision trees were constructed from the attributes ordered by information gain to classify unknown medical patterns. Experimental results on three different datasets verified that the proposed system was effective and feasible for the classification of medical databases.
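The information-gain ranking step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the split point is derived from the SVM, so the sketch assumes one split point per attribute is already available and only shows how such split points could be scored and ordered. The function names (entropy, information_gain, rank_attributes), the attribute names, the toy data, and the split values are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, split_point):
    """Entropy reduction obtained by splitting one attribute at split_point."""
    left = [y for x, y in zip(values, labels) if x <= split_point]
    right = [y for x, y in zip(values, labels) if x > split_point]
    total = len(labels)
    remainder = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(labels) - remainder


def rank_attributes(columns, labels, split_points):
    """Order attributes by information gain, highest first."""
    gains = {
        name: information_gain(columns[name], labels, split_points[name])
        for name in columns
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two candidate attributes and binary class labels.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "attr_a": [1, 2, 3, 7, 8, 9],   # separates the two classes cleanly
    "attr_b": [5, 1, 6, 2, 7, 3],   # only weakly related to the class
}
# Assumed split points; in the proposed system these would come from the
# staged SVM classification, which is not reproduced here.
split_points = {"attr_a": 5.0, "attr_b": 4.0}

ranking = rank_attributes(columns, labels, split_points)
for name, gain in ranking:
    print(f"{name}: information gain = {gain:.4f}")
```

In this sketch the attribute with the highest gain would be the candidate for the root of the decision tree, with the remaining attributes placed at lower levels in descending order of gain.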
