Differentially-Private Control-Flow Node Coverage for Software Usage Analysis

The collection of usage data from deployed software raises significant privacy concerns. We propose a novel privacy-preserving solution for a problem of central importance to software usage analysis: control-flow graph coverage analysis over many deployed software instances. Our solution employs the machinery of differential privacy and its generalizations, and makes the following technical contributions: (1) a new notion of privacy guarantees based on a neighbor relation between control-flow graphs that prevents causality-based inference, (2) a new differentially-private algorithm design based on a novel definition of sensitivity with respect to differences between neighbors, (3) an efficient implementation of the algorithm using dominator trees derived from control-flow graphs, (4) a pruning approach that reduces the noise level by tightening the sensitivity bound using restricted sensitivity, and (5) a refined notion of relaxed indistinguishability based on distances between neighbors. Our evaluation demonstrates that these techniques can achieve practical accuracy while providing principled privacy-by-design guarantees.
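
The link between dominators and causality-based inference can be illustrated with a small sketch. It relies only on the standard dominance property: if a node was executed, every node that dominates it was executed too, so reporting one covered node also reveals its dominators. The abstract does not give the paper's neighbor relation, sensitivity definition, or algorithm, so none of them appear here. The CFG, the node names, and the helper names `dominators` and `implied_coverage` are hypothetical, and the simple iterative algorithm stands in for whatever dominator-tree construction the authors use.

```python
def dominators(cfg, entry):
    """Map each node to the set of nodes that dominate it.

    Uses the classic iterative data-flow formulation:
    dom(n) = {n} union (intersection of dom(p) over all predecessors p of n).
    Assumes every node appears as a key of cfg and is reachable from entry.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)

    # Start from the most conservative guess and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def implied_coverage(cfg, entry, observed):
    """Nodes an observer can infer were covered, given the observed covered nodes."""
    dom = dominators(cfg, entry)
    implied = set()
    for n in observed:
        implied |= dom[n]
    return implied


# Hypothetical CFG with a diamond: entry -> a -> (b | c) -> d -> e
cfg = {
    "entry": ["a"],
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}

# Observing only "e" reveals entry, a, and d, but neither b nor c.
print(sorted(implied_coverage(cfg, "entry", {"e"})))
```

In this example neither branch node `b` nor `c` dominates `d`, so observing `e` says nothing about which branch ran. This is the kind of structural dependency that a privacy definition over control-flow graphs has to take into account.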
