ATVHunter: Reliable Version Detection of Third-Party Libraries for Vulnerability Identification in Android Applications

Third-party libraries (TPLs) are an essential part of the mobile ecosystem and one of the most significant contributors to the success of Android, because they speed up the development of Android applications. Detecting TPLs in Android apps is also important for downstream tasks such as identifying malware and repackaged apps. Identifying in-app TPLs requires solving several challenges, including TPL dependencies, code obfuscation, and precise version representation. Unfortunately, existing TPL detection tools have been shown not to handle these challenges well, let alone pinpoint the exact TPL versions. To this end, we propose a system named ATVHunter, which pinpoints the precise vulnerable in-app TPL versions and provides detailed information about the vulnerabilities and the TPLs. ATVHunter uses a two-phase detection approach to identify specific TPL versions: it first extracts Control Flow Graphs (CFGs) as a coarse-grained feature to match potential TPLs in a predefined TPL database, and then extracts the opcodes in each basic block of the CFGs as a fine-grained feature to identify the exact TPL versions. We build a comprehensive TPL database (189,545 unique TPLs with 3,006,676 versions) as the reference database. To identify vulnerable in-app TPL versions, we also construct a comprehensive database of known vulnerable TPLs containing 1,180 CVEs and 224 security bugs. Experimental results show that ATVHunter outperforms state-of-the-art TPL detection tools, achieving 90.55% precision and 88.79% recall with high efficiency; it is also resilient to widely used obfuscation techniques and scalable to large-scale TPL detection.
Furthermore, to investigate the ecosystem of vulnerable TPLs used by apps, we use ATVHunter to conduct a large-scale analysis of 104,446 apps and find that 9,050 apps include vulnerable TPL versions with 53,337 vulnerabilities and 7,480 security bugs, most of which are high-risk and are not recognized by app developers.
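
The two-phase idea described above (a coarse CFG match to shortlist candidate TPLs, then a fine opcode match to pick the version) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Method` representation, the function names, the thresholds, and the use of plain SHA-256 digests with a containment score are stand-ins, not ATVHunter's actual features, hashing, or similarity measure.

```python
import hashlib
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a method is a list of basic blocks, and each
# block is (opcodes, indices of successor blocks).
Block = Tuple[List[str], List[int]]
Method = List[Block]


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cfg_signature(method: Method) -> str:
    """Coarse feature: digest of the CFG shape only (block count and edges)."""
    edges = sorted(
        (src, dst) for src, (_, succs) in enumerate(method) for dst in succs
    )
    return _digest(f"{len(method)}|{edges}")


def opcode_signature(method: Method) -> str:
    """Fine feature: digest of the opcode sequence of every basic block."""
    return _digest("|".join(",".join(ops) for ops, _ in method))


def containment(lib_feats: Set[str], app_feats: Set[str]) -> float:
    """Fraction of the library's features that also occur in the app."""
    return len(lib_feats & app_feats) / len(lib_feats) if lib_feats else 0.0


def detect(
    app: List[Method],
    database: Dict[str, Dict[str, List[Method]]],
    coarse_threshold: float = 0.6,  # assumed value, not from the paper
    fine_threshold: float = 0.5,    # assumed value, not from the paper
) -> List[Tuple[str, str, float]]:
    """Return (library, version, score) for each library matched in the app."""
    app_cfg = {cfg_signature(m) for m in app}
    app_ops = {opcode_signature(m) for m in app}
    results = []
    for lib, versions in database.items():
        # Phase 1: shortlist the library if any version matches on CFG shape.
        coarse = max(
            containment({cfg_signature(m) for m in methods}, app_cfg)
            for methods in versions.values()
        )
        if coarse < coarse_threshold:
            continue
        # Phase 2: pick the version whose opcode features match best.
        scored = [
            (containment({opcode_signature(m) for m in methods}, app_ops), ver)
            for ver, methods in versions.items()
        ]
        score, version = max(scored)
        if score >= fine_threshold:
            results.append((lib, version, score))
    return results


# Toy data: two versions share a CFG shape but differ in one opcode.
v10 = [[(["const", "if-eqz"], [1, 2]), (["return"], []), (["invoke", "return"], [])]]
v11 = [[(["const", "if-nez"], [1, 2]), (["return"], []), (["invoke", "return"], [])]]
app_own = [[(["new-instance", "return"], [])]]
database = {"demo": {"1.0": v10, "1.1": v11}}
app = v11 + app_own
print(detect(app, database))
```

In this toy data both versions pass the coarse phase because their CFG shapes are identical, and only the opcode comparison separates them. Exact digests were chosen to keep the sketch short; they would not tolerate the small code changes that obfuscation introduces.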
