Identifying Android Malware Using Network-Based Approaches

The proliferation of Android apps has allowed many malicious apps to enter the market and cause significant damage, so robust techniques for determining whether an app is malicious are greatly needed. We propose a network-based approach that separates malicious from benign apps using only a small labeled dataset. The apps in our dataset come from the Google Play Store and were scanned for malicious behavior with VirusTotal to produce a ground-truth dataset labeled malicious or benign. Each app is represented as a binary feature vector whose features capture permissions, intent actions, discriminative APIs, obfuscation signatures, and native code signatures. From these feature vectors we build a weighted network that captures the “closeness” between apps. We then propagate labels from the labeled apps to the unlabeled apps and evaluate the approach using the F1-measure. We conducted experiments comparing three label propagation variants on datasets that include increasing amounts of labeled data. The results show that a variant proposed in this study gives the best results overall.
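
The pipeline summarized above (binary feature vectors, a weighted app network, label propagation, F1 evaluation) can be illustrated with a minimal sketch. Several choices here are assumptions rather than details taken from this study: Jaccard similarity is used as the edge weight, the propagation is the standard clamped iterative scheme rather than any of the three variants compared in the paper, and the six toy apps, their five features, and all function names are hypothetical.

```python
import numpy as np

def jaccard_weights(X):
    """Pairwise Jaccard similarity between binary rows (assumed edge weight)."""
    X = np.asarray(X, dtype=float)
    inter = X @ X.T                                  # shared features per pair
    counts = X.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    W = np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
    np.fill_diagonal(W, 0.0)                         # no self-loops
    return W

def propagate_labels(W, labels, n_iter=100, tol=1e-6):
    """Clamped label propagation. labels: 1 malicious, 0 benign, -1 unlabeled."""
    labels = np.asarray(labels)
    known = np.flatnonzero(labels >= 0)
    # Score matrix: column 0 = benign, column 1 = malicious.
    F = np.zeros((len(labels), 2))
    F[known, labels[known]] = 1.0
    clamp = F[known].copy()
    row_sums = W.sum(axis=1, keepdims=True)
    P = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    for _ in range(n_iter):
        F_new = P @ F                # each app averages its neighbors' scores
        F_new[known] = clamp         # labeled apps keep their ground truth
        converged = np.abs(F_new - F).max() < tol
        F = F_new
        if converged:
            break
    return F.argmax(axis=1)

def f1_measure(y_true, y_pred, positive=1):
    """F1 for the positive (malicious) class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

# Hypothetical toy data: 6 apps x 5 binary features.
X = [
    [1, 1, 1, 0, 0],   # app 0: labeled malicious
    [1, 1, 0, 0, 0],   # app 1: unlabeled
    [1, 1, 1, 1, 0],   # app 2: unlabeled
    [0, 0, 0, 1, 1],   # app 3: labeled benign
    [0, 0, 1, 1, 1],   # app 4: unlabeled
    [0, 0, 0, 0, 1],   # app 5: unlabeled
]
labels = [1, -1, -1, 0, -1, -1]
held_out_truth = [1, 1, 0, 0]        # invented labels for apps 1, 2, 4, 5

W = jaccard_weights(X)
pred = propagate_labels(W, labels)
unlabeled = [i for i, y in enumerate(labels) if y < 0]
score = f1_measure(held_out_truth, pred[unlabeled])
print(pred.tolist(), score)
```

In this toy network the two labeled apps pull their nearest neighbors toward their own class, so the unlabeled apps split into a malicious group and a benign group. A real evaluation would hold out part of the VirusTotal-labeled apps and compute the F1-measure on those.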