Security and privacy in big data

In the Big Data era, storing massive data securely and processing it effectively is increasingly important and challenging. Big Data is generally processed and queried on cloud computing platforms. Cloud computing provides elastic, scalable IT services in a pay-as-you-go fashion, but it also introduces privacy and security problems. In addition, adversaries can exploit correlations in Big Data, together with background knowledge, to steal sensitive information. These correlations are complex and differ substantially from the relationships between data stored in traditional relational databases, which makes ensuring security and protecting privacy in Big Data harder still. In this chapter, we discuss important security and privacy issues in Big Data from three aspects: secure queries over encrypted cloud data, security technologies related to Big Data, and the security and privacy of correlated Big Data.
