Mining control flow graph as API call-grams to detect portable executable malware

Present-day malware is stealthy and dynamic, and acquires administrative rights to control victim computers. Malware writers rely on evasion techniques such as code obfuscation, packing, compression, encryption, or polymorphism to avoid detection by Anti-Virus (AV) scanners, because AV scanners primarily use syntactic signatures to detect known malware. Our approach is based on the semantic aspects of a PE executable and analyses API Call-grams to detect unknown malicious code. Because only the inexact source code is analysed and the sample is not run, the machine is not infected by the executable. Moreover, static analysis covers all paths of the code, which is not possible with dynamic behavioural methods, as the latter do not guarantee execution of the sample being analysed. Modern malicious samples also detect controlled virtual and emulated environments and stop functioning. A semantically invariant approach is important because the signatures of known samples are changed by code obfuscation tools. Static analysis is performed by generating an API Call graph from the control flow of an executable, then mining the Call graph as API Call-grams to detect malicious files.
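
As a rough illustration of the last step, the sketch below treats an API Call-gram as a path of n consecutive API calls in a directed call graph and counts every such path. This reading of "Call-gram", the adjacency-dictionary graph representation, the function name `call_grams`, and the toy API names are all assumptions made for illustration; the abstract does not specify how the graph is extracted or how the grams are mined.

```python
from collections import Counter


def call_grams(graph, n):
    """Count every path of n consecutive API calls in a directed call graph.

    `graph` maps an API name to the list of API names that may follow it.
    This is a hypothetical representation, not the paper's data structure.
    Assumes n >= 1.
    """
    counts = Counter()

    def extend(path):
        # A path of length n is one call-gram; record it and stop growing.
        if len(path) == n:
            counts[tuple(path)] += 1
            return
        # Otherwise follow every outgoing edge of the last call.
        for nxt in graph.get(path[-1], ()):
            extend(path + [nxt])

    # Every node listed as a key is a possible starting point for a gram.
    for node in graph:
        extend([node])
    return counts


# Toy call graph with made-up edges between Win32 file APIs.
toy_graph = {
    "CreateFileA": ["WriteFile", "ReadFile"],
    "ReadFile": ["WriteFile"],
    "WriteFile": ["CloseHandle"],
    "CloseHandle": [],
}

grams = call_grams(toy_graph, 3)
for gram, count in sorted(grams.items()):
    print(" -> ".join(gram), count)
```

In a detection pipeline, the resulting counts could serve as a feature vector for a classifier. Because path length is capped at n, the enumeration terminates even when the call graph contains cycles.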
