Run-time classification of malicious processes using system call analysis

This study presents a malware classification system designed to classify malicious processes at run-time on production hosts. The system monitors process-level system call activity and uses information extracted from system call traces as input to the classifier. Its main advantage is that it does not require a specialized analysis environment: a lightweight service application monitors process execution and classifies new malware samples based on their behavioral similarity to known malware. The study compares the effectiveness of multiple feature sets, ground truth labeling schemes, and machine learning algorithms for malware classification. Classification accuracy is evaluated against process-level system call traces of recently discovered malware samples collected from production environments. Experimental results indicate that accurate classification can be achieved using relatively short system call traces and simple representations.
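
As a rough illustration of what classifying a process by "behavioral similarity to known malware" could look like with a simple representation, the sketch below turns a system call trace into a bag of call n-grams and assigns the label of the most similar known trace under cosine similarity. This is not the study's actual feature set or algorithm, which the abstract does not specify; the function names, the choice of bigrams, the nearest-neighbor rule, the family labels, and the system call names in the example are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_counts(trace, n=2):
    """Count overlapping n-grams of system call names in one trace."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(trace, labeled_traces, n=2):
    """Return (label, similarity) of the most similar known trace."""
    query = ngram_counts(trace, n)
    best_label, best_score = None, -1.0
    for label, known in labeled_traces:
        score = cosine(query, ngram_counts(known, n))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical labeled traces; family names and call names are illustrative.
known = [
    ("family_a", ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtClose"]),
    ("family_b", ["NtCreateKey", "NtSetValueKey", "NtClose", "NtCreateKey"]),
]
observed = ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtWriteFile", "NtClose"]

label, score = classify(observed, known)
print(label, round(score, 3))
```

A nearest-neighbor rule is used here only because it is the shortest way to show similarity-based labeling; any of the machine learning algorithms the study compares could consume the same n-gram counts.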
