Chronological Analysis of Source Code Reuse Impact on Android Application Security

Application developers regard open discussion forums on software development, such as question-and-answer (Q&A) forums, as very important. Snippets (partial source code) posted on such forums sometimes contain vulnerabilities, and application developers reuse these snippets without being aware of it. Previous work focused on security-related code, such as TLS connection code, and not on actual vulnerable code that is widely used. Thus, no time-series investigation of the spread of such code has been conducted. In this paper, we propose a method that enables chronological analysis of source code reuse. By detecting source code reuse in applications, we can examine the context using time information, such as the respective publication dates and times, and clarify how many cases are not source code reuse. We evaluate the proposed method on large-scale data comprising 527,249 source code snippets and 249,987 applications. The results show that the appearance rate of applications containing the same code as a snippet increased after the snippet was released. Furthermore, experiments on extracting vulnerable snippets from all snippets show that vulnerable snippets often have a greater impact than the overall snippet trend.
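
The chronological check described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Match` record, its field names, and the two labels are hypothetical, and the sketch assumes that code matching between snippets and applications has already been done by some other means. It only shows the date comparison: an application released before the snippet was posted cannot have copied it, so that match is counted as "not reuse".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Match:
    """Hypothetical record: one application containing the same code as one snippet."""
    snippet_id: str
    app_id: str
    snippet_posted: date   # publication date of the snippet on the forum
    app_released: date     # publication date of the application


def classify(match: Match) -> str:
    """Label a match using only its publication dates (illustrative assumption)."""
    # An application released before the snippet was posted cannot have reused it.
    if match.app_released < match.snippet_posted:
        return "not_reuse"
    # Released on or after the posting date: reuse is possible, not proven.
    return "possible_reuse"


def summarize(matches):
    """Count how many matches fall into each label."""
    counts = {"possible_reuse": 0, "not_reuse": 0}
    for m in matches:
        counts[classify(m)] += 1
    return counts


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    posted = date(2015, 1, 1)
    sample = [
        Match("s1", "a1", posted, date(2014, 6, 1)),
        Match("s1", "a2", posted, date(2016, 3, 1)),
    ]
    print(summarize(sample))
```

A later release date only makes reuse possible; it does not prove that the developer copied the snippet, which is why the second label is deliberately named "possible_reuse".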
