Privacy streamliner: a two-stage approach to improving algorithm efficiency

When releasing data that contains sensitive information, a data owner usually faces seemingly conflicting goals: privacy preservation, utility optimization, and algorithm efficiency. In this paper, we observe that high computational complexity is usually incurred when an algorithm conflates the processes of privacy preservation and utility optimization. We then propose a novel privacy streamliner approach that decouples these two processes to improve algorithm efficiency. More specifically, we first identify a set of potential privacy-preserving solutions such that an adversary's knowledge of the set itself does not help him or her violate the privacy property; we can then optimize utility within this set without worrying about privacy breaches, since such an optimization is now simulatable by adversaries. To make our approach more concrete, we study it in the context of micro-data release with publicly known generalization algorithms. Both the analysis and the experiments confirm that our algorithms are more efficient than existing solutions.
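The two-stage structure described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the record layout (age, disease), the fixed list of candidate bucket widths, the distinct l-diversity check, and all function names are assumptions chosen for illustration. In particular, the plain filter in stage 1 does not by itself guarantee that knowledge of the candidate set is harmless to privacy; the sketch only shows how utility optimization can run as a separate step over a precomputed set.

```python
from collections import defaultdict

def group_by_width(records, width):
    """Generalize age into buckets of the given width; collect sensitive values per bucket."""
    groups = defaultdict(list)
    for age, disease in records:
        groups[age // width].append(disease)
    return groups

def is_l_diverse(groups, l):
    """Distinct l-diversity (assumed privacy property): every group has >= l distinct values."""
    return all(len(set(values)) >= l for values in groups.values())

def privacy_stage(records, widths, l):
    """Stage 1: keep only the candidate generalizations that satisfy the privacy check."""
    return [w for w in widths if is_l_diverse(group_by_width(records, w), l)]

def utility_stage(safe_widths):
    """Stage 2: choose the finest generalization; this step never reads the records."""
    return min(safe_widths) if safe_widths else None

# Hypothetical micro-data: (age, disease)
records = [(21, "flu"), (23, "cold"), (27, "flu"),
           (28, "asthma"), (34, "cold"), (36, "flu")]

safe = privacy_stage(records, widths=[5, 10, 20, 40], l=2)
best = utility_stage(safe)
print(safe, best)
```

Because stage 2 takes only the candidate set as input, anyone who knows that set can reproduce the choice, which is the sense in which the optimization step is simulatable in this toy setting.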
