Malware Classification Framework for Dynamic Analysis using Information Theory

Objectives: 1. To propose a framework for a Malware Classification System (MCS) that analyzes malware behavior dynamically using concepts from information theory and a machine learning technique. 2. To extract behavioral patterns, expressed as features, from the execution reports of malware and to generate a data repository from them. 3. To select the most promising features using information-theoretic concepts.
Methods/Statistical Analysis: Malware is a major concern for computer security experts. A growing number and variety of malware, in the form of viruses, worms, Trojans, etc., affects millions of systems. Many techniques have been proposed to assign malware to its class accurately. Static analysis techniques examine malware by its structure, code flow, etc. without executing it, whereas dynamic analysis techniques execute the malware, monitor its behavior, and compare that behavior with known malware behavior. Dynamic analysis has proved effective in malware detection because behavior during execution is more difficult to mask than the underlying code that static analysis inspects. In this study, we propose a framework for a Malware Classification System (MCS) that analyzes malware behavior dynamically using concepts from information theory and a machine learning technique. The proposed framework extracts behavioral patterns, expressed as features, from the execution reports of malware and generates a data repository. It then selects the most promising features using information-theoretic concepts.
Findings: After a classifier has been trained on the malware data repository, the proposed framework detects the family of unknown malware samples. We validated the applicability of the proposed framework by comparing it with another dynamic malware analysis technique on a real malware dataset from VirusTotal.
Application: The proposed framework serves as a Malware Classification System (MCS) that analyzes malware behavior dynamically using concepts from information theory and a machine learning technique.
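
The feature selection step can be illustrated with a small sketch. The abstract does not say which information-theoretic measure the framework uses, so the example below assumes information gain (the reduction in class entropy after observing a feature) as one common choice. The feature names, family labels, and the helper `select_top_features` are hypothetical and serve only as illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_features(samples, labels, k):
    """Rank features by information gain and keep the k best.

    `samples` is a list of dicts mapping a feature name to a discrete
    value; a feature missing from a sample is treated as 0 (assumption).
    """
    names = sorted({name for sample in samples for name in sample})
    scored = [
        (information_gain([s.get(name, 0) for s in samples], labels), name)
        for name in names
    ]
    # Highest gain first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    # Hypothetical binary behavior features from four execution reports.
    reports = [
        {"creates_mutex": 1, "opens_socket": 1},
        {"creates_mutex": 1, "opens_socket": 0},
        {"creates_mutex": 0, "opens_socket": 1},
        {"creates_mutex": 0, "opens_socket": 0},
    ]
    families = ["family_a", "family_a", "family_b", "family_b"]
    print(select_top_features(reports, families, k=1))
```

In this toy data, `creates_mutex` separates the two families perfectly while `opens_socket` carries no information about the family, so only the former would be retained and passed on to the classifier.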
