How I Met Your Mother? - An Empirical Study about Android Malware Phylogenesis

Android malware is becoming increasingly aggressive, both in its impact on the victim’s device and in its ability to evade detection. Attackers target not only smartphones and the sensitive information they hold, but also devices such as watches, glasses, and anything else that can be connected to the Internet of Things. Current signature-based antimalware and anomaly-based detection cannot detect zero-day attacks: even trivial code transformations can defeat detection. New malware is often not really new: malware writers routinely add functionality to existing malware or merge pieces of existing malware code. This gives rise to families of Android malware, i.e., malware programs that share some essential features or behaviors while modifying other parts. Recognizing the family a malware sample belongs to is useful for malware analysis, fast infection response, and quick incident resolution. In this paper we introduce DescentDroid, a tool that traces a malware sample back to the family it descends from. We evaluate our technique on an extended dataset comprising malware and trusted applications, obtaining high precision in recognizing malware family membership.

[38]  Josephine Micallef,et al.  Detection of global, metamorphic malware variants using control and data flow analysis , 2012, MILCOM 2012 - 2012 IEEE Military Communications Conference.