Differentially Private Continual Monitoring of Heavy Hitters from Distributed Streams

We consider application scenarios in which an untrusted aggregator wishes to continually monitor the heavy hitters across a set of distributed streams. Since each stream can contain sensitive data, such as the purchase history of customers, we wish to guarantee the privacy of each stream while allowing the untrusted aggregator to accurately detect the heavy hitters and their approximate frequencies. Our protocols scale to settings where the volume of streaming data is large: they guarantee low memory usage and processing overhead at each data source, and low communication overhead between the data sources and the aggregator.
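The abstract does not spell out the protocols, so the following is only a minimal sketch of the general setting, not the authors' construction. It assumes each data source keeps a small Misra-Gries summary (at most k-1 counters, which keeps memory low), perturbs the counters with Laplace noise before sending them, and that the aggregator sums the noisy summaries and reports items above a threshold. The function names, the `scale` parameter, and the threshold rule are all hypothetical, and the sketch makes no claim about how the noise must be calibrated to meet a formal differential privacy guarantee.

```python
import random
from collections import defaultdict


def misra_gries(stream, k):
    """Summarize a stream with at most k-1 counters.

    Each reported count underestimates the true count by at most n/k,
    where n is the stream length.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those reaching zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_summary(stream, k, scale, rng):
    """Run at the data source: only noisy counters leave the source.

    `scale` is an illustrative parameter; choosing it to satisfy a given
    privacy budget is outside the scope of this sketch.
    """
    return {item: count + laplace(scale, rng)
            for item, count in misra_gries(stream, k).items()}


def aggregate(summaries, threshold):
    """Run at the aggregator: sum noisy counters, keep items above threshold."""
    totals = defaultdict(float)
    for summary in summaries:
        for item, value in summary.items():
            totals[item] += value
    return {item: value for item, value in totals.items() if value >= threshold}


if __name__ == "__main__":
    rng = random.Random(0)
    source_a = ["milk"] * 6 + ["eggs", "tea", "jam"]
    source_b = ["milk"] * 5 + ["tea"] * 2 + ["rice"]
    reports = [noisy_summary(s, k=3, scale=1.0, rng=rng)
               for s in (source_a, source_b)]
    print(aggregate(reports, threshold=5.0))
```

In this sketch each source sends at most k-1 numbers per report, which is where the low communication cost would come from; a continual-monitoring protocol would additionally have to account for the privacy cost of repeated reports over time.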
