Understanding security mistakes developers make: Qualitative analysis from Build It, Break It, Fix It

Secure software development is a challenging task requiring consideration of many possible threats and mitigations. This paper investigates how and why programmers, despite a baseline of security experience, make security-relevant errors. To do this, we conducted an in-depth analysis of 94 submissions to a secure-programming contest designed to mimic real-world constraints: correctness, performance, and security. In addition to writing secure code, participants were asked to search for vulnerabilities in other teams’ programs; in total, teams submitted 866 exploits against the submissions we considered. Over an intensive six-month period, we used iterative open coding to manually, but systematically, characterize each submitted project and vulnerability (including vulnerabilities we identified ourselves). We labeled vulnerabilities by type, attacker control allowed, and ease of exploitation, and projects according to security implementation strategy. Several patterns emerged. For example, simple mistakes were least common: only 21% of projects introduced such an error. Conversely, vulnerabilities arising from a misunderstanding of security concepts were significantly more common, appearing in 78% of projects. Our results have implications for improving secure-programming APIs, API documentation, vulnerability-finding tools, and security education.
