Malware behavioural detection and vaccine development by using a support vector model classifier

Most existing approaches for detecting viruses rely on signature-based analysis that matches the precise patterns of known malware threats. However, classification accuracy for unspecified malware depends on the correct extraction and completeness of the training signatures. In practice, a malware detection system can use the generalization ability of support vector models (SVMs), a machine learning technique, to guarantee a small classification error. This study developed an automatic malware detection system by training an SVM classifier on behavioural signatures. A cross-validation scheme was applied to address classification accuracy, using SVMs associated with 60 families of real malware. The experimental results reveal that the classification error decreases as the size of the testing data increases. Across different malware sample sizes (N), the prediction accuracy of malware detection reaches 98.7% at N = 100. The overall detection accuracy of the SVM classifier (SVC) exceeds 85% for unspecified mobile malware.
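As a rough illustration of the workflow described above (train an SVM on behavioural signatures, then estimate accuracy by cross-validation), the following is a minimal sketch and not the study's implementation. The abstract does not give the feature set, kernel, or tooling, so everything here is an assumption: the six binary behaviour flags are synthetic, the function names (`make_samples`, `train_linear_svm`, `predict`, `cross_validate`) are hypothetical, and a linear SVM trained with Pegasos-style stochastic sub-gradient descent is swapped in for whatever solver the authors used.

```python
import random


def make_samples(n, seed=0):
    """Synthetic data (assumption): 6 binary behaviour flags per sample.
    Malware (+1) tends to set flags 0-2, benign software (-1) flags 3-5."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        probs = [0.9] * 3 + [0.1] * 3 if label == 1 else [0.1] * 3 + [0.9] * 3
        X.append([1.0 if rng.random() < p else 0.0 for p in probs])
        y.append(label)
    return X, y


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic sub-gradient descent on the hinge loss.
    The last weight acts as a bias term (regularised here for simplicity)."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            xi = X[i] + [1.0]              # append constant bias feature
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularisation shrink
            if margin < 1.0:               # margin violated: hinge-loss step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, x):
    """Return +1 (malware) or -1 (benign) from the sign of the decision value."""
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1


def cross_validate(X, y, k=5, seed=0):
    """k-fold cross-validation; returns the mean accuracy over the k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(w, X[i]) == y[i] for i in fold)
        accuracies.append(hits / len(fold))
    return sum(accuracies) / k


if __name__ == "__main__":
    X, y = make_samples(200, seed=1)
    print("mean 5-fold accuracy:", cross_validate(X, y, k=5))
```

Because each fold is held out in turn, the reported figure estimates accuracy on samples the classifier has not seen, which is the setting the abstract's "unspecified malware" results refer to. A nonlinear kernel or a multiclass scheme over malware families would require a different solver than this linear sketch.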
