Checking App Behavior Against App Descriptions: What If There are No App Descriptions?

Classifying mobile apps based on their descriptions is beneficial for several purposes. However, many app descriptions do not reflect the app's functionality, whether accidentally or on purpose. Most importantly, description-based classification methods do not work if the app description is unavailable. This paper investigates a Reverse Engineering-based Approach to Classify mobile apps using The data that exists in the app, called REACT. To validate the proposed REACT method, we use a large set of Android apps (24,652 apps in total). We also show REACT's extensibility to malware/anomaly detection and demonstrate its reliability and scalability. However, our analysis reveals some limitations in the REACT procedure and implementation, especially for grouping apps by similar features. We discuss the root causes of these failures, our key lessons learned, and some ideas for future enhancement. We also share our REACT tools and reproduced datasets with app market analysts, mobile app developers, and the software engineering research community for further research.
