Relationship privacy: output perturbation for queries with joins

We study privacy-preserving query answering over data containing relationships. A social network is a prime example of such data, where nodes represent individuals and edges represent relationships between them. Nearly all interesting queries over social networks involve joins, and for such queries, existing output perturbation algorithms severely distort query answers. We propose an algorithm that significantly improves utility over competing techniques, typically reducing the error bound from polynomial in the number of nodes to polylogarithmic. The algorithm is, to the best of our knowledge, the first to answer such queries with acceptable accuracy, even for worst-case inputs. The improved utility is achieved by relaxing the privacy condition: instead of ensuring strict differential privacy, we guarantee a weaker (but still quite practical) condition based on adversarial privacy. To characterize precisely the nature of this relaxation, we provide a new result, of independent interest, on the relationship between ε-indistinguishability (a variant of the differential privacy definition) and adversarial privacy: an algorithm is ε-indistinguishable iff it is private against a particular class of adversaries (defined precisely herein). Our perturbation algorithm guarantees privacy against adversaries in this class whose prior distribution is numerically bounded.
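For context, a standard formulation of the definitions involved, stated in illustrative notation that is not taken from the abstract itself: a randomized algorithm A is ε-indistinguishable if, for every pair of neighboring inputs D and D′ (differing in a single tuple, here a single edge) and every set of outputs S,

    Pr[A(D) ∈ S] ≤ e^ε · Pr[A(D′) ∈ S].

The canonical output-perturbation mechanism achieves this by releasing q(D) + Lap(Δq/ε), where Δq is the global sensitivity of the query q, i.e., the maximum change in q between neighboring inputs. For join queries over network data, Δq can grow polynomially in the number of nodes, which is why strict ε-indistinguishability forces the large error that the relaxation described above is designed to avoid.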
