LibD: Scalable and Precise Third-Party Library Detection in Android Markets

With the thriving of the mobile app markets, third-party libraries are pervasively integrated in the Android applications. Third-party libraries provide functionality such as advertisements, location services, and social networking services, making multi-functional app development much more productive. However, the spread of vulnerable or harmful third-party libraries may also hurt the entire mobile ecosystem, leading to various security problems. The Android platform suffers severely from such problems due to the way its ecosystem is constructed and maintained. Therefore, third-party Android library identification has emerged as an important problem which is the basis of many security applications such as repackaging detection and malware analysis. According to our investigation, existing work on Android library detection still requires improvement in many aspects, including accuracy and obfuscation resilience. In response to these limitations, we propose a novel approach to identifying third-party Android libraries. Our method utilizes the internal code dependencies of an Android app to detect and classify library candidates. Different from most previous methods which classify detected library candidates based on similarity comparison, our method is based on feature hashing and can better handle code whose package and method names are obfuscated. Based on this approach, we have developed a prototypical tool called LibD and evaluated it with an update-to-date and large-scale dataset. Our experimental results on 1,427,395 apps show that compared to existing tools, LibD can better handle multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without the loss of scalability.

[1]  Saumya K. Debray,et al.  Deobfuscation: reverse engineering obfuscated code , 2005, 12th Working Conference on Reverse Engineering (WCRE'05).

[2]  Mario Linares Vásquez,et al.  On automatically detecting similar Android apps , 2016, 2016 IEEE 24th International Conference on Program Comprehension (ICPC).

[3]  Sencun Zhu,et al.  Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection , 2014, SIGSOFT FSE.

[4]  Yajin Zhou,et al.  Fast, scalable detection of "Piggybacked" mobile applications , 2013, CODASPY.

[5]  Hui Zang,et al.  AdRob: examining the landscape and impact of android application plagiarism , 2013, MobiSys '13.

[6]  Shinji Kusumoto,et al.  CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code , 2002, IEEE Trans. Software Eng..

[7]  Hao Chen,et al.  Investigating User Privacy in Android Ad Libraries , 2012 .

[8]  Haoyu Wang,et al.  LibRadar: Fast and Accurate Detection of Third-Party Libraries in Android Apps , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C).

[9]  Hao Chen,et al.  Attack of the Clones: Detecting Cloned Applications on Android Markets , 2012, ESORICS.

[10]  Jacques Klein,et al.  An Investigation into the Use of Common Libraries in Android Apps , 2015, 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

[11]  Yuanyuan Zhou,et al.  CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code , 2004, OSDI.

[12]  Norman M. Sadeh,et al.  Modeling Users' Mobile App Privacy Preferences: Restoring Usability in a Sea of Permission Settings , 2014, SOUPS.

[13]  Annamalai Narayanan,et al.  AdDetect: Automated detection of Android ad libraries using semantic analysis , 2014, 2014 IEEE Ninth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP).

[14]  Hao Chen,et al.  AnDarwin: Scalable Detection of Semantically Similar Android Applications , 2013, ESORICS.

[15]  Zhendong Su,et al.  DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones , 2007, 29th International Conference on Software Engineering (ICSE'07).

[16]  Peng Liu,et al.  Achieving accuracy and scalability simultaneously in detecting application clones on Android markets , 2014, ICSE.

[17]  Haoyu Wang,et al.  WuKong: a scalable and accurate two-phase approach to Android app clone detection , 2015, ISSTA.

[18]  Tongxin Li,et al.  Mayhem in the Push Clouds: Understanding and Mitigating Security Hazards in Mobile Push-Messaging Services , 2014, CCS.

[19]  Mario Linares Vásquez,et al.  Revisiting Android reuse studies in the context of code obfuscation and library usages , 2014, MSR 2014.

[20]  Douglas Low,et al.  Java Control Flow Obfuscation , 1998 .

[21]  Latifur Khan,et al.  SMV-Hunter: Large Scale, Automated Detection of SSL/TLS Man-in-the-Middle Vulnerabilities in Android Apps , 2014, NDSS.

[22]  Subhasish Mazumdar,et al.  Introducing Privacy Threats from Ad Libraries to Android Users Through Privacy Granules , 2015 .

[23]  Steve Hanna,et al.  Juxtapp: A Scalable System for Detecting Code Reuse among Android Applications , 2012, DIMVA.

[24]  J. Crussell,et al.  Scalable Semantics-Based Detection of Similar Android Applications , 2013 .

[25]  Hongxia Jin,et al.  Efficient Privilege De-Escalation for Ad Libraries in Mobile Apps , 2015, MobiSys.

[26]  Sencun Zhu,et al.  ViewDroid: towards obfuscation-resilient mobile application repackaging detection , 2014, WiSec '14.

[27]  Heng Yin,et al.  Code Injection Attacks on HTML5-based Mobile Apps: Characterization, Detection and Mitigation , 2014, CCS.

[28]  Christian Rossow,et al.  Cross-Architecture Bug Search in Binary Executables , 2015, 2015 IEEE Symposium on Security and Privacy.

[29]  Bin Ma,et al.  Following Devil's Footprints: Cross-Platform Analysis of Potentially Harmful Libraries on Android and iOS , 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[30]  Susan Horwitz,et al.  Using Slicing to Identify Duplication in Source Code , 2001, SAS.

[31]  Xuxian Jiang,et al.  Unsafe exposure analysis of mobile in-app advertisements , 2012, WISEC '12.

[32]  Dan S. Wallach,et al.  Longitudinal Analysis of Android Ad Library Permissions , 2013, ArXiv.

[33]  Guofei Gu,et al.  SmartDroid: an automatic system for revealing UI-based trigger conditions in android applications , 2012, SPSM '12.