Chapter 12 – Security and Privacy in Big Data

In the Big Data era, storing massive data securely and processing it effectively is increasingly important and challenging. Big Data is generally processed and queried on cloud computing platforms. Cloud computing provides elastic and scalable IT services in a pay-as-you-go fashion, but it also introduces privacy and security problems. In addition, adversaries can exploit correlations in Big Data, together with background knowledge, to steal sensitive information. These correlations are highly complex and differ substantially from the traditional relationships among data stored in relational databases, which makes ensuring security and protecting privacy in Big Data increasingly difficult. In this chapter, we discuss important security and privacy issues in Big Data from three aspects: secure queries over encrypted cloud data, security technologies related to Big Data, and the security and privacy of correlated Big Data.
