Membership Inference Attacks on Aggregated Time Series with Linear Programming

Aggregating data is a widely used technique to protect privacy. Membership inference attacks on aggregated data aim to infer whether a specific target belongs to a given aggregate. We study how aggregated time series data can be susceptible to simple membership inference attacks in the presence of adversarial background knowledge. We design a linear programming attack that strongly benefits from the number of data points published in the series. In an extensive experimental evaluation on multiple publicly available datasets, we show that aggregates made of thousands of time series are vulnerable when the aggregate size is not carefully balanced with the length of the published time series.
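To make the idea of a linear programming attack concrete, here is a minimal sketch. It is not the formulation used in the paper: it assumes the adversary's background knowledge is a pool of N candidate time series of length T, that the published aggregate is the plain sum of k of them, and that k is known. The function name `lp_membership_scores` and all parameters are hypothetical. The program relaxes each unknown membership bit to a weight in [0, 1] and minimizes the L1 gap between the weighted sum of the candidates and the published aggregate.

```python
import numpy as np
from scipy.optimize import linprog


def lp_membership_scores(candidates, aggregate, k):
    """Return one relaxed membership weight in [0, 1] per candidate series.

    candidates: array of shape (T, N), one column per candidate series (assumed known).
    aggregate:  array of shape (T,), the published sum of the k member series.
    k:          number of series in the aggregate (assumed known).
    """
    T, N = candidates.shape
    # Variables: x (N weights), then p and q (T each), the positive and
    # negative parts of the residual, so that candidates @ x + p - q = aggregate.
    cost = np.concatenate([np.zeros(N), np.ones(2 * T)])  # minimize sum(p + q)
    A_eq = np.zeros((T + 1, N + 2 * T))
    A_eq[:T, :N] = candidates
    A_eq[:T, N:N + T] = np.eye(T)
    A_eq[:T, N + T:] = -np.eye(T)
    A_eq[T, :N] = 1.0  # the weights must sum to k
    b_eq = np.concatenate([aggregate, [k]])
    bounds = [(0.0, 1.0)] * N + [(0.0, None)] * (2 * T)
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:N]


# Synthetic illustration (random data, not one of the paper's datasets).
rng = np.random.default_rng(0)
T, N, k = 40, 12, 5
candidates = rng.uniform(0.0, 10.0, size=(T, N))
members = np.zeros(N)
members[rng.choice(N, size=k, replace=False)] = 1.0
aggregate = candidates @ members

scores = lp_membership_scores(candidates, aggregate, k)
guessed = scores > 0.5  # a weight near 1 suggests the series is in the aggregate
```

In this toy setting there are more published time points than candidates, so the linear system generically has a single solution and the weights identify the members. With fewer time points than candidates the system is underdetermined and many weightings can fit the aggregate, which is one way to read the abstract's point that the attack benefits from longer published series.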
