Fast Private Norm Estimation and Heavy Hitters

We consider the problems of computing the Euclidean norm of the difference of two vectors and, as an application, computing the large components (Heavy Hitters) in the difference. We provide protocols that are approximate but private in the semi-honest model and efficient in terms of time and communication in the vector length N. We provide the following, which can serve as building blocks for other protocols:

- Euclidean norm problem: we give a protocol with quasi-linear local computation and polylogarithmic communication in N, leaking only the true value of the norm. For processing massive datasets, the intended application, where N is typically huge, our improvement over a recent result with quadratic runtime is significant.

- Heavy Hitters problem: suppose that, for a prescribed B, we want the B largest components in the difference vector. We give a protocol with quasi-linear local computation and polylogarithmic communication, leaking only the set of true B largest components and the Euclidean norm of the difference vector. We justify the leakage as either (1) desirable, since it gives a measure of goodness of approximation, or (2) inevitable, since we show that there are contexts where linear communication is required for approximating the Heavy Hitters.
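To make the two target quantities concrete, the following is a minimal, non-private sketch in Python. It is not the protocol described above and offers no privacy guarantee. It only shows (a) the exact functions being approximated and (b) how a shared random linear projection lets two parties estimate the norm of the difference from short summaries, since the projection of a difference equals the difference of the projections. The function names, the Gaussian projection, the seed, and the sketch length `k` are all illustrative assumptions, and this dense projection does not achieve the quasi-linear running time claimed above.

```python
import math
import random


def diff_norm(a, b):
    """Exact Euclidean norm of a - b (the quantity to be approximated)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def heavy_hitters(a, b, B):
    """Indices of the B largest-magnitude components of a - b, sorted by index."""
    diff = [x - y for x, y in zip(a, b)]
    order = sorted(range(len(diff)), key=lambda i: abs(diff[i]), reverse=True)
    return sorted(order[:B])


def sketch(v, k, seed):
    """Project v onto k Gaussian directions derived from a shared seed.

    Assumption: both parties agree on `seed`, so they apply the same
    random linear map. This is NOT private; it only illustrates the
    approximation step.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [scale * sum(rng.gauss(0.0, 1.0) * x for x in v) for _ in range(k)]


def estimate_diff_norm(sketch_a, sketch_b):
    """Estimate ||a - b|| from the two k-entry sketches alone."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sketch_a, sketch_b)))


if __name__ == "__main__":
    a = [3.0, 0.0, 0.0, 0.0]
    b = [0.0, 4.0, 0.0, 0.0]
    print(diff_norm(a, b))            # exact norm of the difference
    print(heavy_hitters(a, b, B=2))   # indices of the two largest components
    sa = sketch(a, k=400, seed=7)
    sb = sketch(b, k=400, seed=7)
    print(estimate_diff_norm(sa, sb))  # approximate norm from the sketches
```

In this toy setting each party would send only its `k` sketch entries rather than all N coordinates. A private protocol must additionally ensure that the exchanged messages reveal nothing beyond the stated leakage, which this sketch makes no attempt to do.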
