SVM Feature Selection Based Rotation Forest Ensemble Classifiers to Improve Computer-Aided Diagnosis of Parkinson Disease

Parkinson disease (PD) is an age-related deterioration of certain nerve systems that affects the movement, balance, and muscle control of patients. PD is a common disease, affecting 1% of people older than 60 years. To improve the diagnosis of PD, a new classification scheme is presented in which features selected by a support vector machine (SVM) are used to train rotation forest (RF) ensemble classifiers. The dataset contains records of voice measurements from 31 people, 23 of whom have PD, and each record is described by 22 features. In the first step, the diagnosis model uses a linear SVM to select the ten most relevant of the 22 features. In the second step, six different classifiers are trained on this subset of features. In the third step, the accuracies of the classifiers are improved by applying the RF ensemble classification strategy. The results of the experiments are evaluated using three metrics: classification accuracy (ACC), Kappa Error (KE), and Area under the Receiver Operating Characteristic (ROC) Curve (AUC). The performance measures of two base classifiers, KStar and IBk, showed a clear increase in PD diagnosis accuracy compared to similar studies in the literature. Overall, applying the RF ensemble classification scheme significantly improved PD diagnosis for 5 of the 6 classifiers. Numerically, the RF ensemble of the IBk algorithm (a K-Nearest Neighbor variant) reached about 97% accuracy, which is a high performance for Parkinson disease diagnosis.
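As an illustration of the three evaluation metrics named above, the following is a minimal sketch in plain Python. It assumes that KE refers to Cohen's kappa statistic and that AUC is computed for a binary PD / healthy labelling from classifier scores. The function names (`accuracy`, `cohen_kappa`, `roc_auc`) and the toy labels in the usage note are hypothetical and are not taken from the study.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of records whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cohen_kappa(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Assumption: this is what the abstract calls KE.
    """
    true_list, pred_list = list(y_true), list(y_pred)
    n = len(true_list)
    labels = set(true_list) | set(pred_list)
    p_observed = accuracy(true_list, pred_list)
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    p_expected = sum(
        (true_list.count(c) / n) * (pred_list.count(c) / n) for c in labels
    )
    if p_expected == 1.0:
        return 0.0  # degenerate case: only one label present on both sides
    return (p_observed - p_expected) / (1.0 - p_expected)


def roc_auc(y_true: Sequence[Hashable], scores: Sequence[float],
            positive: Hashable = 1) -> float:
    """AUC as the probability that a positive record outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative record")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, with hypothetical labels `y_true = [1, 1, 1, 1, 0, 0, 0, 0]`, predictions `y_pred = [1, 1, 1, 0, 0, 0, 0, 1]`, and scores `[0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]`, the three functions give an accuracy of 0.75, a kappa of 0.5, and an AUC of 0.9375. Kappa is lower than accuracy here because half of the agreement would be expected by chance alone.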
