Privacy-utility trade-off under continual observation

In the online setting, a user continuously releases to a service provider a time-series that is correlated with their private data, in order to derive some utility. Because of these correlations, continual observation of the time-series exposes the user to inference attacks against the private data. To protect the user's privacy, the time-series is randomized prior to its release according to a probabilistic privacy mapping. This mapping should be designed to balance privacy and utility requirements over time. First, we formalize a framework for the design of utility-aware privacy mappings for time-series, under both online and batch models. We introduce two threat models and show that, under the log-loss cost function, the information leakage is captured by the mutual information and the directed information, respectively, between the randomized time-series and the private data. Second, we prove that the design of the privacy mapping can be cast as a convex optimization problem. We provide a sequential online scheme that allows privacy mappings to be designed at scale and that accounts for the privacy risk from both the history of released data and the releases still to come. Third, we prove that the optimal mappings under the batch and online models are equivalent in the case of a Hidden Markov Model. Evaluations on real-world time-series data show that smart-meter data can be randomized to prevent disaggregation of per-device energy consumption while maintaining the utility of the randomized series.
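
As a rough illustration of the trade-off described above, the following sketch evaluates a privacy mapping at a single time step. It is not the sequential scheme of the paper: the binary private variable S, the binary observation X, the joint distribution `p_sx`, the symmetric bit-flip mapping, the Hamming distortion, and all function names are assumptions made for this toy example. Leakage is measured as the mutual information I(S;Y) between the private variable and the released value, and utility loss as the expected distortion between X and Y.

```python
import math

def mutual_information(joint):
    """Mutual information in bits of a joint distribution joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi

def release_joint(p_sx, mapping):
    """Joint p(s, y) = sum_x p(s, x) * mapping[x][y], assuming S -> X -> Y."""
    n_x, n_y = len(mapping), len(mapping[0])
    return [[sum(p_s[x] * mapping[x][y] for x in range(n_x))
             for y in range(n_y)] for p_s in p_sx]

def expected_distortion(p_sx, mapping, dist):
    """Expected distortion E[d(X, Y)] under the mapping p(y | x)."""
    px = [sum(col) for col in zip(*p_sx)]
    return sum(px[x] * mapping[x][y] * dist[x][y]
               for x in range(len(mapping))
               for y in range(len(mapping[0])))

def flip_mapping(eps):
    """Hypothetical mapping: release X, flipped with probability eps."""
    return [[1 - eps, eps], [eps, 1 - eps]]

# Assumed toy joint distribution of private S (rows) and observed X (columns).
p_sx = [[0.4, 0.1],
        [0.1, 0.4]]
hamming = [[0, 1], [1, 0]]

for eps in (0.0, 0.1, 0.3, 0.5):
    mapping = flip_mapping(eps)
    leakage = mutual_information(release_joint(p_sx, mapping))
    distortion = expected_distortion(p_sx, mapping, hamming)
    print(f"eps={eps:.1f}  leakage={leakage:.4f} bits  distortion={distortion:.2f}")
```

In this toy setting, increasing the flip probability lowers the leakage I(S;Y) while raising the distortion, and at a flip probability of 0.5 the released value is independent of the private variable. A designed mapping would instead be obtained by minimizing the leakage subject to a distortion constraint, rather than by sweeping a one-parameter family.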
