GA-SVM based feature selection and parameter optimization in hospitalization expense modeling

Abstract Feature selection and parameter optimization are two important ways to improve classifier performance. To improve the prediction accuracy of a hospitalization expense model, a novel approach based on the genetic algorithm (GA) is proposed for feature selection and parameter optimization of a support vector machine (SVM). First, the hospitalization expense data are preprocessed, including data cleaning, discretization, and normalization. Second, k-means clustering is used to obtain two category labels. Third, the penalty factor c, the kernel function parameter γ, and the feature mask are used to construct the chromosome. Fourth, a weighted combination of classification accuracy and the number of features is taken as the fitness function, and GA is used to optimize the SVM parameters while simultaneously selecting the optimal feature subset. Finally, single-parameter optimization is performed using GA and particle swarm optimization (PSO), and the resulting optimization performance is compared with that of GA-PCA and PSO-PCA. Experimental results show that the proposed algorithm can quickly obtain suitable feature subsets and SVM parameters, thereby achieving a better classification result.
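The chromosome construction and fitness function described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact encoding or weights, so everything specific here is an assumption made for illustration: the binary encoding with 8 bits per parameter, the search ranges for c and γ, the weights `w_acc` and `w_feat`, the reward term for fewer features, and the hypothetical callback `evaluate_accuracy`, which stands in for training and validating an SVM on the selected features.

```python
from typing import Callable, List, Sequence, Tuple


def decode_bits(bits: Sequence[int], low: float, high: float) -> float:
    """Map a bit string linearly onto the real interval [low, high]."""
    as_int = int("".join(str(b) for b in bits), 2)
    max_int = 2 ** len(bits) - 1
    return low + (high - low) * as_int / max_int


def decode_chromosome(
    chrom: Sequence[int],
    n_bits: int = 8,                                   # assumed bits per parameter
    c_range: Tuple[float, float] = (0.1, 100.0),       # assumed search range for c
    gamma_range: Tuple[float, float] = (0.001, 10.0),  # assumed search range for gamma
) -> Tuple[float, float, List[int]]:
    """Split a chromosome into (c, gamma, feature mask).

    Assumed layout: [c bits | gamma bits | one bit per feature].
    """
    c = decode_bits(chrom[:n_bits], *c_range)
    gamma = decode_bits(chrom[n_bits:2 * n_bits], *gamma_range)
    mask = list(chrom[2 * n_bits:])
    return c, gamma, mask


def fitness(
    chrom: Sequence[int],
    evaluate_accuracy: Callable[[float, float, List[int]], float],
    w_acc: float = 0.8,   # assumed weight on classification accuracy
    w_feat: float = 0.2,  # assumed weight on feature-subset size
    n_bits: int = 8,
) -> float:
    """Weighted combination of accuracy and a reward for using fewer features."""
    c, gamma, mask = decode_chromosome(chrom, n_bits)
    n_selected = sum(mask)
    if n_selected == 0:
        # A classifier cannot be trained without features; give the worst score.
        return 0.0
    accuracy = evaluate_accuracy(c, gamma, mask)
    return w_acc * accuracy + w_feat * (1 - n_selected / len(mask))
```

In practice `evaluate_accuracy` would train an SVM with the decoded c and γ on the masked feature columns and return, for example, a cross-validated accuracy. A GA would then evolve a population of such chromosomes by selection, crossover, and mutation, using `fitness` to rank them.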
