Understanding Android App Piggybacking: A Systematic Study of Malicious Code Grafting

The Android packaging model offers ample opportunities for malware writers to piggyback malicious code in popular apps, which can then spread easily to a large user base. Although recent research has produced approaches and tools to identify piggybacked apps, the literature lacks a comprehensive investigation into this phenomenon. We fill this gap by: 1) systematically building a large set of piggybacked and benign app pairs, which we release to the community; 2) empirically studying the characteristics of malicious piggybacked apps in comparison with their benign counterparts; and 3) providing insights on piggybacking processes. Our findings offer insights that analysis techniques can build upon to improve the overall detection and classification accuracy of piggybacked apps. Among them, we show that piggybacking operations not only concern app code but also extensively manipulate app resource files, largely contradicting common beliefs. We also find that piggybacking is done with little sophistication, in many cases automatically, and often via library code.
