Learning Malware Using Generalized Graph Kernels

Machine learning techniques have been extensively applied to learn and detect malware. However, these techniques often rely on coarse abstractions of programs. In this work, we propose a more precise program model, namely extended API call graphs, where nodes correspond to API function calls, edges specify the execution order between the API functions, and edge labels indicate the dependence relations between the parameters of the API functions. To learn such graphs, we propose to use Generalized Random Walk Graph Kernels combined with Support Vector Machines. We implemented our techniques and obtained encouraging results for malware detection: a detection rate of 96.73% with a false alarm rate of 0.73%.
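To make the idea of comparing labeled API call graphs concrete, the following is a minimal sketch of a plain (not generalized) random walk graph kernel, computed on the direct product of two graphs. It is not the authors' implementation: the graph encoding, the function names `product_graph` and `random_walk_kernel`, the `decay` and `max_len` parameters, and the example API names and edge labels are all assumptions chosen for illustration. The kernel counts walks that agree on node labels (API names) and edge labels (parameter dependences) in both graphs, discounting longer walks.

```python
from itertools import product


def product_graph(g1, g2):
    """Build the direct product of two labeled graphs.

    Each graph is a pair (nodes, edges) where nodes maps a node id to an
    API name and edges maps (src, dst) to an edge label. This encoding is
    an assumption made for this sketch.
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    # A product node exists only where both nodes carry the same API name.
    pairs = [(u, v) for u, v in product(nodes1, nodes2) if nodes1[u] == nodes2[v]]
    index = {pair: i for i, pair in enumerate(pairs)}
    adjacency = [[] for _ in pairs]
    # A product edge exists only where both edges carry the same label.
    for (s1, d1), label1 in edges1.items():
        for (s2, d2), label2 in edges2.items():
            if label1 == label2 and (s1, s2) in index and (d1, d2) in index:
                adjacency[index[(s1, s2)]].append(index[(d1, d2)])
    return adjacency


def random_walk_kernel(g1, g2, decay=0.1, max_len=10):
    """Sum of decay**k times the number of common walks of length k, k <= max_len."""
    adjacency = product_graph(g1, g2)
    n = len(adjacency)
    walks = [1.0] * n      # walks of length 0 ending at each product node
    total = float(n)
    weight = 1.0
    for _ in range(max_len):
        extended = [0.0] * n
        for i, successors in enumerate(adjacency):
            for j in successors:
                extended[j] += walks[i]
        walks = extended
        weight *= decay
        total += weight * sum(walks)
    return total


# Hypothetical graphs: API names and dependence labels are illustrative only.
graph_a = (
    {0: "CreateFile", 1: "WriteFile", 2: "CloseHandle"},
    {(0, 1): "ret->arg1", (1, 2): "arg1->arg1"},
)
graph_b = (
    {0: "CreateFile", 1: "ReadFile"},
    {(0, 1): "ret->arg1"},
)

similarity_aa = random_walk_kernel(graph_a, graph_a)
similarity_ab = random_walk_kernel(graph_a, graph_b)
```

In this sketch, `graph_a` scores higher against itself than against `graph_b`, because the two graphs only share the `CreateFile` node. A matrix of such pairwise values over a labeled training set could then be passed to an SVM that accepts precomputed kernels; that step is omitted here.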
