Mining AndroZoo: A Retrospect

This paper presents a retrospective on AndroZoo, a collection of Android apps, and on the research conducted on top of it. AndroZoo is a growing collection of Android apps gathered from various markets, including the official Google Play store. At the moment, over five million Android apps have been collected. Based on AndroZoo, we have explored several directions that mine Android apps to address various challenges. In this work, we summarize the mining challenges we resolved along three research dimensions (code analysis, app evolution analysis, and malware analysis) and present, for each dimension, several case studies that experimentally demonstrate the usefulness of AndroZoo.
