Introducing Differential Privacy Mechanisms for Mobile App Analytics of Dynamic Content

Mobile app analytics gathers detailed data about millions of app users. Both customers and governments are becoming increasingly concerned about the privacy implications of such data gathering. Thus, it is highly desirable to design privacy-preserving versions of mobile app analytics. We aim to achieve this goal using differential privacy, a leading algorithm design framework for privacy-preserving data analysis.

We apply differential privacy to dynamically-created content that is retrieved from a content server and is displayed to the app user. User interactions with this content are then reported to the app analytics infrastructure. Unlike problems considered in related prior work, such analytics could convey a wealth of sensitive information, for example about an app user’s political beliefs, dietary choices, health conditions, or travel interests. To provide rigorous privacy protections for this information, we design a differentially-private solution for such data gathering.

Our first contribution is a conceptual design for data collection. Since existing approaches cannot be used to solve this problem, we develop a new design to determine how the app gathers data at run time and how it randomizes it to achieve differential privacy. Our second contribution is an instantiation of this design for Android apps that use Google Firebase. This approach keeps privacy logic separate from the app code, and uses code rewriting to automate the introduction and evolution of privacy-related code. Finally, we develop techniques for automated design space characterization. By simulating different execution scenarios and characterizing their privacy/accuracy trade-offs, our analysis provides critical pre-deployment insights to app developers.
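To illustrate what randomizing data on the device before it is reported can look like, the following is a minimal sketch of per-item randomized response, a standard local differential privacy technique. It is not the design proposed in this work: the function names, the fixed `catalog` of content items, and the choice of privacy unit (two inputs that differ in whether a single item was viewed) are all assumptions made for illustration. In particular, assuming a catalog known in advance sidesteps the dynamic-content setting that the abstract identifies as the core difficulty.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting a bit truthfully (hypothetical helper)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomize_report(viewed, catalog, epsilon, rng):
    """Build one user's noisy report on the device.

    Each catalog item gets a bit ("did the user view it?"). The bit is
    kept with probability e^eps / (1 + e^eps) and flipped otherwise, so
    two inputs that differ in a single item produce any given report
    with probabilities that differ by a factor of at most e^eps.
    """
    p_keep = keep_probability(epsilon)
    report = {}
    for item in catalog:
        true_bit = item in viewed
        report[item] = true_bit if rng.random() < p_keep else not true_bit
    return report


def estimate_counts(reports, catalog, epsilon):
    """Server side: unbiased estimate of how many users viewed each item.

    With c true viewers out of n users, the expected number of 1-bits is
    c * p + (n - c) * (1 - p); solving for c gives the estimator below.
    Requires epsilon > 0 so that 2p - 1 is nonzero.
    """
    p_keep = keep_probability(epsilon)
    n = len(reports)
    estimates = {}
    for item in catalog:
        observed = sum(1 for r in reports if r[item])
        estimates[item] = (observed - n * (1.0 - p_keep)) / (2.0 * p_keep - 1.0)
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    catalog = ["article_a", "article_b", "article_c"]  # assumed item ids
    # Simulated users: everyone in the first half viewed article_a only.
    users = [{"article_a"} if i < 5000 else set() for i in range(10000)]
    reports = [randomize_report(u, catalog, 1.0, rng) for u in users]
    print(estimate_counts(reports, catalog, 1.0))
```

Smaller values of `epsilon` flip more bits, which strengthens the privacy guarantee and increases the variance of the server-side estimates. Simulating such runs for several values of `epsilon` is one simple way to observe the kind of privacy/accuracy trade-off that the abstract refers to.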
