MET𝔸P: revisiting Privacy-Preserving Data Publishing using secure devices

The goal of Privacy-Preserving Data Publishing (PPDP) is to generate a sanitized (i.e., harmless) view of sensitive personal data (e.g., a health survey) that can be released to selected agencies or to the public. However, traditional PPDP practices all assume that the process runs on a trusted central server. In this article, we argue that this trust assumption is far too strong. We propose Met𝔸P, a generic, fully distributed protocol that executes various forms of PPDP algorithms on an asymmetric architecture composed of low-power secure devices and a powerful but untrusted infrastructure. We show that this protocol is both correct and secure against honest-but-curious and malicious adversaries. Finally, we provide an experimental validation showing that the protocol can support PPDP processes that scale up to nationwide surveys.
