PDA: Semantically Secure Time-Series Data Analytics with Dynamic User Groups

Third-party analysis of private records is becoming increasingly important as data is widely collected for various analysis purposes. However, data in its original form often contains sensitive information about individuals, and publishing it would severely breach their privacy. In this paper, we present PDA, a novel privacy-preserving data analytics framework that allows a third-party aggregator to obliviously conduct many different types of polynomial-based analysis on private data records provided by a dynamic subgroup of users. Notably, every user needs to keep only $O(n)$ keys to join data analysis among $O(2^n)$ different groups of users, and any data analysis that can be represented by polynomials is supported by our framework. In addition, a real implementation shows that the performance of our framework is comparable to that of peer works that present ad hoc solutions for specific data analysis applications. Beyond these properties, PDA is provably secure against a very powerful attacker (chosen-plaintext attack), even in the Dolev-Yao network model, where all communication channels are insecure.
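
As a rough illustration of what "analysis represented by polynomials" can cover, the sketch below computes a mean and a variance purely from power sums of the users' values, that is, from polynomials in the inputs. It is not the PDA protocol: it runs on plaintext, involves no keys or encryption, and the function names (`power_sum`, `mean_and_variance`) and sample readings are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def power_sum(values, degree):
    """Return sum(v ** degree), a degree-`degree` polynomial in the inputs."""
    return sum(Fraction(v) ** degree for v in values)


def mean_and_variance(values):
    """Derive mean and population variance from two power sums only.

    An aggregator that learns just sum(x_i) and sum(x_i ** 2) has enough
    information to compute both statistics; individual x_i are not needed.
    """
    n = len(values)
    s1 = power_sum(values, 1)  # first-degree polynomial: sum of x_i
    s2 = power_sum(values, 2)  # second-degree polynomial: sum of x_i^2
    mean = s1 / n
    variance = s2 / n - mean ** 2
    return mean, variance


# Hypothetical readings from four users (plaintext, for illustration only).
readings = [3, 5, 7, 9]
mean, variance = mean_and_variance(readings)
print(mean, variance)
```

Exact rational arithmetic via `Fraction` is used here only to keep the example free of floating-point rounding; it says nothing about how the framework itself encodes values.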
