Stack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security

Online programming discussion platforms such as Stack Overflow serve as a rich source of information for software developers. Available information include vibrant discussions and oftentimes ready-to-use code snippets. Previous research identified Stack Overflow as one of the most important information sources developers rely on. Anecdotes report that software developers copy and paste code snippets from those information sources for convenience reasons. Such behavior results in a constant flow of community-provided code snippets into production software. To date, the impact of this behaviour on code security is unknown. We answer this highly important question by quantifying the proliferation of security-related code snippets from Stack Overflow in Android applications available on Google Play. Access to the rich source of information available on Stack Overflow including ready-to-use code snippets provides huge benefits for software developers. However, when it comes to code security there are some caveats to bear in mind: Due to the complex nature of code security, it is very difficult to provide ready-to-use and secure solutions for every problem. Hence, integrating a security-related code snippet from Stack Overflow into production software requires caution and expertise. Unsurprisingly, we observed insecure code snippets being copied into Android applications millions of users install from Google Play every day. To quantitatively evaluate the extent of this observation, we scanned Stack Overflow for code snippets and evaluated their security score using a stochastic gradient descent classifier. In order to identify code reuse in Android applications, we applied state-of-the-art static analysis. Our results are alarming: 15.4% of the 1.3 million Android applications we analyzed, contained security-related code snippets from Stack Overflow. Out of these 97.9% contain at least one insecure code snippet.

[1]  Serge Vaudenay,et al.  On the Weak Keys of Blowfish , 1996, FSE.

[2]  Bruce Schneier,et al.  Related-Key Cryptanalysis of 3-WAY , 1997 .

[3]  Sihan Qing,et al.  Proceedings of the First International Conference on Information and Communication Security , 1997 .

[4]  Bruce Schneier,et al.  Related-key cryptanalysis of 3-WAY, Biham-DES, CAST, DES-X, NewDES, RC2, and TEA , 1997, ICICS.

[5]  Daniel Bleichenbacher,et al.  Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS #1 , 1998, CRYPTO.

[6]  Serge Vaudenay Fast software encryption : 5th International Workshop, FSE '98, Paris, France, March 23-25, 1998 : proceedings , 1998, FSE 1998.

[7]  Stefan Lucks,et al.  Attacking Triple Encryption , 1998, FSE.

[8]  Burton S. Kaliski,et al.  PKCS #5: Password-Based Cryptography Specification Version 2.0 , 2000, RFC.

[9]  Bernhard Schölkopf,et al.  Statistical Learning and Kernel Methods , 2001, Data Fusion and Perception.

[10]  James Manger,et al.  A Chosen Ciphertext Attack on RSA Optimal Asymmetric Encryption Padding (OAEP) as Standardized in PKCS #1 v2.0 , 2001, CRYPTO.

[11]  Gerhard Goos,et al.  Fast Software Encryption , 2001, Lecture Notes in Computer Science.

[12]  Lindsay I. Smith,et al.  A tutorial on Principal Components Analysis , 2002 .

[13]  Serge Vaudenay,et al.  Security Flaws Induced by CBC Padding - Applications to SSL, IPSEC, WTLS , 2002, EUROCRYPT.

[14]  Zhendong Su,et al.  DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones , 2007, 29th International Conference on Software Engineering (ICSE'07).

[15]  Orhun Kara,et al.  A New Class of Weak Keys for Blowfish , 2007, FSE.

[16]  Ralph Weischedel,et al.  PERFORMANCE MEASURES FOR INFORMATION EXTRACTION , 2007 .

[17]  Eric Rescorla,et al.  The Transport Layer Security (TLS) Protocol Version 1.2 , 2008, RFC.

[18]  Laurie J. Hendren,et al.  Enabling static analysis for partial java programs , 2008, OOPSLA.

[19]  Kam-Fai Wong,et al.  Interpreting TF-IDF term weights as making relevance decisions , 2008, TOIS.

[20]  Kaspar Riesen,et al.  Graph Classification Based on Vector Space Embedding , 2009, Int. J. Pattern Recognit. Artif. Intell..

[21]  Byung-Gon Chun,et al.  TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones , 2010, OSDI.

[22]  Thai Duong,et al.  Cryptography in the Web: The Case of Cryptographic Design Flaws in ASP.NET , 2011, 2011 IEEE Symposium on Security and Privacy.

[23]  Elaine B. Barker,et al.  Transitions: Recommendation for Transitioning the Use of Cryptographic Algorithms and Key Lengths , 2011 .

[24]  Sean Turner,et al.  Prohibiting Secure Sockets Layer (SSL) Version 2.0 , 2011, RFC.

[25]  Elaine B. Barker,et al.  SP 800-131A. Transitions: Recommendation for Transitioning the Use of Cryptographic Algorithms and Key Lengths , 2011 .

[26]  Swarat Chaudhuri,et al.  A Study of Android Application Security , 2011, USENIX Security Symposium.

[27]  Steve Hanna,et al.  Android permissions demystified , 2011, CCS '11.

[28]  Christoph Treude,et al.  How do programmers ask and answer questions on the web?: NIER track , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[29]  Steve Hanna,et al.  Juxtapp: A Scalable System for Detecting Code Reuse among Android Applications , 2012, DIMVA.

[30]  Hao Chen,et al.  Attack of the Clones: Detecting Cloned Applications on Android Markets , 2012, ESORICS.

[31]  Bernd Freisleben,et al.  Why eve and mallory love android: an analysis of android SSL (in)security , 2012, CCS.

[32]  Vitaly Shmatikov,et al.  The most dangerous code in the world: validating SSL certificates in non-browser software , 2012, CCS.

[33]  Matthew Smith,et al.  Hey, You, Get Off of My Clipboard - On How Usability Trumps Security in Android Password Managers , 2013, Financial Cryptography.

[34]  David Brumley,et al.  An empirical study of cryptographic misuse in android applications , 2013, CCS.

[35]  Hao Chen,et al.  AnDarwin: Scalable Detection of Semantically Similar Android Applications , 2013, ESORICS.

[36]  Kenneth G. Paterson,et al.  On the Security of RC4 in TLS , 2013, USENIX Security Symposium.

[37]  Alexander Serebrenik,et al.  StackOverflow and GitHub: Associations between Software Development and Crowdsourced Knowledge , 2013, 2013 International Conference on Social Computing.

[38]  Matthew Smith,et al.  Rethinking SSL development in an appified world , 2013, CCS.

[39]  Matthew Smith,et al.  Hey, NSA: Stay Away from my Market! Future Proofing App Markets against Powerful Attackers , 2014, CCS.

[40]  Christopher Krügel,et al.  Execute This! Analyzing Unsafe and Malicious Dynamic Code Loading in Android Applications , 2014, NDSS.

[41]  Gabriele Bavota,et al.  How do API changes trigger stack overflow discussions? a study on the Android SDK , 2014, ICPC 2014.

[42]  Reid Holmes,et al.  Live API documentation , 2014, ICSE.

[43]  Peng Liu,et al.  Achieving accuracy and scalability simultaneously in detecting application clones on Android markets , 2014, ICSE.

[44]  Jose L. Muñoz,et al.  Evaluation of Cryptographic Capabilities for the Android Platform , 2015, FNSS.

[45]  Selwyn Piramuthu,et al.  Future Network Systems and Security , 2015, Communications in Computer and Information Science.

[46]  Peter Saint-Andre,et al.  Recommendations for Secure Use of Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) , 2015, RFC.

[47]  Matthew Smith,et al.  To Pin or Not to Pin-Helping App Developers Bullet Proof Their TLS Connections , 2015, USENIX Security Symposium.

[48]  Michael Backes,et al.  You Get Where You're Looking for: The Impact of Information Sources on Code Security , 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[49]  Matthew Smith,et al.  SoK: Lessons Learned from Android Security Research for Appified Software Platforms , 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[50]  Eric Rescorla,et al.  The Transport Layer Security (TLS) Protocol Version 1.3 , 2018, RFC.