Eight Years of Rider Measurement in the Android Malware Ecosystem

Despite the growing threat posed by Android malware, the research community still lacks a comprehensive view of the common behaviors and trends exhibited by malware families active on the platform. Without such a view, researchers risk developing systems that detect only outdated threats and miss the most recent ones. In this paper, we conduct the largest measurement of Android malware behavior to date, analyzing over 1.2 million malware samples that belong to 1.2K families over a period of eight years (from 2010 to 2017). We aim to understand how the behavior of Android malware has evolved over time, focusing on repackaging malware. In this type of threat, different innocuous apps are piggybacked with a malicious payload (rider), allowing inexpensive malware manufacturing. One of the main challenges in studying repackaged malware is slicing the app to separate its benign components from the malicious ones. To address this problem, we use differential analysis to isolate software components that are irrelevant to the campaign and study the behavior of the malicious riders alone. Our analysis framework relies on collective repositories and on recent advances in the systematization of intelligence extracted from multiple anti-virus vendors. We find that since its infancy in 2010, the Android malware ecosystem has changed significantly, both in the type of malicious activity performed by the samples and in the level of obfuscation used to avoid detection. We then show that our framework can aid analysts who attempt to study unknown malware families. Finally, we discuss what our findings mean for Android malware detection research, highlighting areas that need further attention from the research community.
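
The idea behind differential analysis can be illustrated with a minimal sketch. It is not the paper's implementation, whose details are not given in this abstract. It assumes each app in a family has already been reduced to a set of component identifiers (class names here, purely for illustration) and treats components shared by most apps in the family as the rider, while components unique to one carrier app are set aside as benign. The function name `isolate_rider`, the `min_share` threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def isolate_rider(apps, min_share=0.9):
    """Return components present in at least `min_share` of the family's apps.

    `apps` maps an app name to the set of component identifiers it contains.
    Both the representation and the threshold are illustrative assumptions.
    """
    if not apps:
        return set()
    counts = Counter()
    for components in apps.values():
        # Count each component at most once per app.
        counts.update(set(components))
    threshold = min_share * len(apps)
    return {c for c, n in counts.items() if n >= threshold}


# Hypothetical family: three unrelated carrier apps sharing one payload.
family = {
    "flashlight.apk": {"com.light.Main", "com.light.Ui", "com.evil.Sms", "com.evil.C2"},
    "puzzle.apk": {"com.puzzle.Board", "com.evil.Sms", "com.evil.C2"},
    "wallpaper.apk": {"com.wall.Gallery", "com.evil.Sms", "com.evil.C2"},
}

rider = isolate_rider(family)
print(sorted(rider))
```

In this toy family only the two `com.evil.*` components appear in every app, so they are the ones kept as the candidate rider. A threshold below 1.0 tolerates samples in which part of the payload is missing or renamed.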
