An improved support vector machines model in medical data analysis

The support vector machine (SVM) is an emerging classification technique that has been applied successfully to many classification problems. However, three factors (feature selection, dimension reduction, and parameter selection) strongly influence the classification performance of SVM models. This study therefore developed an improved support vector machine (IMSVM) model that uses factor analysis (FA), kernel sliced inverse regression (KSIR), and honey-bee mating optimisation with genetic algorithms (HBMOG) to handle feature selection, dimension reduction, and parameter selection, respectively. The Statlog heart dataset from the Center for Machine Learning and Intelligent Systems at the University of California, Irvine (UCI) was then used to demonstrate the performance of the IMSVM model. Experimental results revealed that the IMSVM model can provide more accurate classification results than the classification models reported in previous literature. Thus, the proposed model is a promising alternative for analysing medical data.
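
To make the parameter selection step more concrete, the following is a minimal sketch of an evolutionary search over two SVM hyperparameters. It is a plain genetic algorithm, not the HBMOG procedure used in the study, and everything in it is an assumption made for illustration: the search ranges for log2(C) and log2(gamma), the function names, and above all `toy_fitness`, a hypothetical smooth function standing in for the cross-validated accuracy that a real run would compute on the training data.

```python
import math
import random

# Search bounds for log2(C) and log2(gamma); these ranges are assumptions.
BOUNDS = [(-5.0, 15.0), (-15.0, 3.0)]


def toy_fitness(params):
    """Hypothetical stand-in for cross-validated SVM accuracy.

    Peaks at log2(C) = 3, log2(gamma) = -5. Replace it with a real
    cross-validation score when a dataset and SVM library are available.
    """
    log_c, log_gamma = params
    return math.exp(-((log_c - 3.0) ** 2 + (log_gamma + 5.0) ** 2) / 50.0)


def clip(value, low, high):
    """Keep a gene inside its search interval."""
    return max(low, min(high, value))


def genetic_search(fitness, bounds, pop_size=30, generations=40,
                   mutation_scale=0.5, seed=0):
    """Plain genetic algorithm: binary tournament selection, blend
    crossover, Gaussian mutation, and elitism (the best individual of
    each generation is copied unchanged into the next)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]

    def tournament():
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        elite = max(population, key=fitness)
        children = [elite[:]]
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            w = rng.random()
            # Blend crossover: a random convex combination of the parents.
            child = [w * x + (1.0 - w) * y for x, y in zip(p1, p2)]
            # Gaussian mutation, clipped back into the search bounds.
            child = [clip(g + rng.gauss(0.0, mutation_scale), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children

    best = max(population, key=fitness)
    return best, fitness(best)


best_params, best_score = genetic_search(toy_fitness, BOUNDS)
best_c, best_gamma = 2.0 ** best_params[0], 2.0 ** best_params[1]
print(f"C = {best_c:.3g}, gamma = {best_gamma:.3g}, fitness = {best_score:.4f}")
```

Searching in log2 space is a common convention for SVM hyperparameters because useful values of C and gamma span several orders of magnitude. Passing the fitness function as an argument keeps the search loop independent of any particular classifier or dataset.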
