Understanding Privacy-Related Questions on Stack Overflow

We analyse Stack Overflow (SO) to understand the challenges and confusion developers face when dealing with privacy-related topics. We apply topic modelling to 1,733 privacy-related questions to identify topics, and then qualitatively analyse a random sample of 315 privacy-related questions. The identified topics include privacy policies, privacy concerns, access control, and version changes. The results show that developers do turn to SO for support on privacy-related issues. We also find that platforms such as Apple and Google are defining privacy requirements for developers by specifying what counts as "sensitive" information and what types of information developers need to communicate to users (e.g., privacy policies). Finally, we examine the accepted answers in our sample and find that 28% of them link to official documentation, while more than half are written by SO users without reference to any external resource.
