Simulation Study of Feature Selection on Survival Least Square Support Vector Machines with Application to Health Data

One of the most commonly used semi-parametric survival models is the Cox Proportional Hazards Model (Cox PHM), which requires several conditions to hold, among them the proportional hazards assumption across the categories of each predictor. Real data do not always satisfy this assumption. One alternative is a non-parametric approach, the Survival Least Square-Support Vector Machine (SURLS-SVM). Unlike the Cox PHM, however, the SURLS-SVM does not indicate which predictors are significant. To address this, feature selection by backward elimination is applied, using the increment in the c-index as the criterion. This paper compares the two approaches, Cox PHM and SURLS-SVM, by their c-index on simulated and clinical data. The empirical results show that the c-index of SURLS-SVM is higher than that of Cox PHM on both datasets. The simulation study is replicated 100 times, and its results show that non-relevant predictors are often included in the model because of confounding. In the application to clinical data (cervical cancer), feature selection yields nine relevant predictors out of twelve. Three of these nine relevant predictors in SURLS-SVM are the significant predictors in Cox PHM.
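As an illustration of the selection criterion described above, the following is a minimal sketch of a concordance index and a backward elimination loop driven by c-index increments. It is not the authors' implementation: the function names `c_index` and `backward_elimination`, the `evaluate` callback, the convention that a higher score means longer predicted survival, and the stopping rule (stop when no single removal strictly increases the c-index) are all assumptions made for the example.

```python
def c_index(times, events, scores):
    """Concordance index for right-censored data.

    Assumption: a higher score means a longer predicted survival time.
    A pair (i, j) is comparable when subject i had the event and
    times[i] < times[j]; tied scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] < scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def backward_elimination(features, evaluate):
    """Drop one feature at a time while doing so increases the c-index.

    `evaluate(subset)` is a hypothetical callback that fits a model on the
    given feature subset and returns its c-index.
    """
    selected = list(features)
    best = evaluate(selected)
    while len(selected) > 1:
        # c-index obtained after removing each remaining feature in turn
        candidates = [
            (evaluate([f for f in selected if f != g]), g) for g in selected
        ]
        score, drop = max(candidates)
        if score <= best:
            break  # no removal gives a c-index increment
        best = score
        selected.remove(drop)
    return selected, best
```

In practice `evaluate` would train the survival model (for example SURLS-SVM) on the chosen predictors and compute the c-index on held-out data; here it is left as a callback so the selection loop stays independent of the model.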
