Identifying Android Malware Using Network-Based Approaches

The proliferation of Android applications has resulted in many malicious apps entering the market and causing significant damage. Robust techniques that determine if an app is malicious are greatly needed. We propose the use of network-based approaches to effectively separate malicious from benign apps, based on a small labeled dataset. The apps in our dataset come from the Google Play Store and have been scanned for malicious behavior using VirusTotal to produce a ground truth dataset with labels malicious or benign. The apps in the resulting dataset have been represented in the form of binary feature vectors (where the features represent permissions, intent actions, discriminative APIs, obfuscation signatures, and native code signatures). We have used these vectors to build a weighted network that captures the “closeness” between apps. We propagate labels from the labeled apps to unlabeled apps, and evaluate the effectiveness of the approaches studied using the Fl-measure. We have conducted experiments to compare three variants of the label propagation approaches on datasets that consist of increasingly larger amounts of labeled data.