Improving grasshopper optimization algorithm for hyperparameters estimation and feature selection in support vector regression

Abstract High dimensionality is one of the major problems affecting the quality of classification and prediction modeling. Support vector regression (SVR) has been applied to several real-world problems. However, its hyperparameters usually have to be tuned manually. In addition, SVR cannot perform feature selection. Nature-inspired algorithms have been used both for feature selection and for hyperparameter estimation. In this paper, an improved grasshopper optimization algorithm (GOA) is proposed, which adopts a new function for the main controlling parameter of GOA to enhance its exploration and exploitation capability. The improved algorithm is used to optimize the hyperparameters of the SVR while performing feature selection simultaneously. Experimental results on four datasets show that the proposed algorithm performs better than the cross-validation method in terms of prediction, number of selected features, and running time. In addition, the results confirm that the proposed algorithm improves prediction performance and computational time compared to other nature-inspired algorithms, demonstrating the ability of GOA to search for the best hyperparameter values and to select the most informative features for prediction tasks.
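As a rough illustration of the setup described in the abstract, the sketch below shows two pieces: the linearly decreasing controlling parameter c used in the standard GOA (the new function proposed in the paper is not given in the abstract, so it is not reproduced here), and one possible way to decode a single search agent into SVR hyperparameters plus a binary feature mask. The function names, the vector layout, the log-scale ranges, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def linear_c(iteration, max_iter, c_max=1.0, c_min=1e-5):
    """Standard GOA schedule: c decreases linearly from c_max to c_min.

    The default bounds are illustrative; the paper replaces this schedule
    with a different function that the abstract does not specify.
    """
    return c_max - iteration * (c_max - c_min) / max_iter


def log_scale(u, low, high):
    """Map u in [0, 1] onto [low, high] on a logarithmic scale."""
    return 10 ** (math.log10(low) + u * (math.log10(high) - math.log10(low)))


def decode(position, n_features):
    """Split one agent (values in [0, 1]) into SVR settings and a feature mask.

    Assumed layout: [C, epsilon, gamma, f_1, ..., f_n]. The ranges below are
    hypothetical, not taken from the paper.
    """
    params = {
        "C": log_scale(position[0], 1e-2, 1e3),
        "epsilon": log_scale(position[1], 1e-4, 1e0),
        "gamma": log_scale(position[2], 1e-4, 1e1),
    }
    # Assumed binarization: a feature is kept when its entry exceeds 0.5.
    mask = [value > 0.5 for value in position[3:3 + n_features]]
    return params, mask


if __name__ == "__main__":
    params, mask = decode([0.5, 0.2, 0.7, 0.9, 0.1, 0.6], n_features=3)
    print(params, mask)
    print(linear_c(iteration=50, max_iter=100))
```

In a full optimizer, each decoded agent would be scored by fitting an SVR on the selected features and measuring its prediction error, and c would scale the grasshopper interaction term at every iteration.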
