SVM Classifier Based Feature Selection Using GA, ACO and PSO for siRNA Design

Recently there has been considerable interest in applying evolutionary and natural computing techniques to the analysis of large datasets with a large number of features. In particular, efficacy prediction of siRNA has attracted many researchers because of the large number of features involved. In the present work, we apply an SVM-based classifier together with PSO, ACO and GA to the Huesken dataset of siRNA features, as well as to two other benchmark datasets (wine and wdbc breast cancer gene), and achieve considerably high accuracy; the results are presented. We also highlight the data size necessary for better SVM accuracy with the selected kernel. Both groups of features (sequential and thermodynamic) are important in the efficacy prediction of siRNA. The results of our study are compared with other results available in the literature.
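As a rough illustration of the wrapper idea summarized above, the following minimal sketch searches over binary feature masks with a genetic algorithm and scores each mask by classifier accuracy. It is not the authors' implementation: the synthetic data, the nearest-centroid classifier standing in for the SVM, the per-feature penalty, and all names and parameter values (`make_data`, `centroid_accuracy`, `ga_select`, `pop_size`, `p_mut`, and so on) are assumptions made for this example. In the setting described here, the fitness would instead be the accuracy of an SVM trained on the selected features.

```python
import random

def make_data(rng, n=120, n_informative=3, n_noise=7):
    """Synthetic two-class data (assumed for illustration only):
    the first n_informative columns carry signal, the rest are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Stand-in fitness: hold-out accuracy of a nearest-centroid classifier
    restricted to the features where mask is 1 (an SVM would go here)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    half = len(X) // 2
    centroids = {}
    for c in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in range(half, len(X)):
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += int(min(dist, key=dist.get) == y[i])
    return correct / (len(X) - half)

def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small (assumed) penalty per selected feature."""
    return centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, rng, pop_size=20, generations=20, p_mut=0.1):
    """GA over binary masks: tournament selection, one-point crossover,
    bit-flip mutation, and elitism (the best mask always survives)."""
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(X, y, m))

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(X, y, a) >= fitness(X, y, b) else b

    for _ in range(generations):
        children = [best[:]]
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=lambda m: fitness(X, y, m))
    return best

rng = random.Random(0)
X, y = make_data(rng)
best = ga_select(X, y, rng)
print("selected feature indices:", [j for j, keep in enumerate(best) if keep])
```

PSO and ACO variants would keep the same mask encoding and fitness function and change only how candidate masks are generated from one iteration to the next.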
