Security and Privacy for Big Data

Security and privacy are among the most critical issues for big data and have drawn considerable attention from both industry and the research community. Following this trend, this chapter provides an overview of state-of-the-art research issues and achievements in the security and privacy of big data, highlighting recent advances in data encryption, privacy preservation, and trust management. The section on data encryption analyzes searchable encryption, order-preserving encryption, structured encryption, and homomorphic encryption in turn. The section on privacy preservation reviews three representative mechanisms: access control, auditing, and statistical privacy. The section on trust management investigates several approaches, in particular trusted-computing-based approaches and trust and reputation models. Current security measures for big data platforms, particularly Apache Hadoop, are also discussed. The approaches selected for this survey represent only a small fraction of the wide research effort on the security and privacy of big data. Nevertheless, they indicate the diversity of the challenges being addressed.
