Understanding Developers Privacy Concerns Through Reddit Thread Analysis

With the growing global emphasis on regulating the protection of personal information, and with users increasingly expecting such protection, developing with privacy in mind is becoming ever more important. In this paper, we study the concerns, questions, and solutions that developers discuss on Reddit to better understand their perceptions and the challenges they face when building applications in today's privacy-focused world. We apply Natural Language Processing (NLP) techniques to 437,317 threads from subreddits such as r/webdev, r/androiddev, and r/iOSProgramming to identify common points of discussion and how these points change over time as new regulations are passed around the globe. Our results show that privacy topics follow common trends across the subreddits, while the frequency of those topics differs between web and mobile applications.
