Big Data Analytics over Encrypted Datasets with Seabed

Today, enterprises collect large amounts of data and leverage the cloud to perform analytics over this data. Since the data is often sensitive, enterprises would prefer to keep it confidential and to hide it even from the cloud operator. Systems such as CryptDB and Monomi can accomplish this by operating mostly on encrypted data; however, these systems rely on expensive cryptographic techniques that limit performance in true "big data" scenarios that involve terabytes of data or more. This paper presents Seabed, a system that enables efficient analytics over large encrypted datasets. In contrast to previous systems, which rely on asymmetric encryption schemes, Seabed uses a novel, additively symmetric homomorphic encryption scheme (ASHE) to perform large-scale aggregations efficiently. Additionally, Seabed introduces a novel randomized encryption scheme called Splayed ASHE, or SPLASHE, that can, in certain cases, prevent frequency attacks based on auxiliary data.

[1]  David J. Wu,et al.  Practical Order-Revealing Encryption with Limited Leakage , 2016, FSE.

[2]  Samuel Madden,et al.  Processing Analytical Queries over Encrypted Data , 2013, Proc. VLDB Endow..

[3]  Hari Balakrishnan,et al.  CryptDB: protecting confidentiality with encrypted query processing , 2011, SOSP.

[4]  Chuang Liu,et al.  The Unified Logging Infrastructure for Data Analytics at Twitter , 2012, Proc. VLDB Endow..

[5]  Owen Kaser,et al.  Better bitmap performance with Roaring bitmaps , 2014, Softw. Pract. Exp..

[6]  Carlos V. Rozas,et al.  Innovative instructions and software model for isolated execution , 2013, HASP '13.

[7]  Mihir Bellare,et al.  Random oracles are practical: a paradigm for designing efficient protocols , 1993, CCS '93.

[8]  Vipin Kumar,et al.  Trends in big data analytics , 2014, J. Parallel Distributed Comput..

[9]  Christos Gkantsidis,et al.  VC3: Trustworthy Data Analytics in the Cloud Using SGX , 2015, 2015 IEEE Symposium on Security and Privacy.

[10]  Srinivas Devadas,et al.  Intel SGX Explained , 2016, IACR Cryptol. ePrint Arch..

[11]  Siddharth Garg,et al.  Securing Computer Hardware Using 3D Integrated Circuit (IC) Technology and Split Manufacturing for Obfuscation , 2013, USENIX Security Symposium.

[12]  Stanley B. Zdonik,et al.  Answering Aggregation Queries in a Secure System Model , 2007, VLDB.

[13]  Charles V. Wright,et al.  Inference Attacks on Property-Preserving Encrypted Databases , 2015, CCS.

[14]  Sergei Skorobogatov,et al.  Breakthrough Silicon Scanning Discovers Backdoor in Military Chip , 2012, CHES.

[15]  Sam Shah,et al.  The big data ecosystem at LinkedIn , 2013, SIGMOD '13.

[16]  Pascal Paillier,et al.  Public-Key Cryptosystems Based on Composite Degree Residuosity Classes , 1999, EUROCRYPT.

[17]  Gil Segev,et al.  Deterministic Public-Key Encryption for Adaptively Chosen Plaintext Distributions , 2013, EUROCRYPT.

[18]  G. Varghese,et al.  Adtributor: Revenue Debugging in Advertising Systems , 2014, NSDI.

[19]  Ramakrishnan Srikant,et al.  Order preserving encryption for numeric data , 2004, SIGMOD '04.

[20]  Ramarathnam Venkatesan,et al.  Orthogonal Security with Cipherbase , 2013, CIDR.

[21]  Beng Chin Ooi,et al.  M2R: Enabling Stronger Privacy in MapReduce Computation , 2015, USENIX Security Symposium.

[22]  Craig Gentry,et al.  Homomorphic Evaluation of the AES Circuit (Updated Implementation) , 2015 .

[23]  Dan Boneh,et al.  Evaluating 2-DNF Formulas on Ciphertexts , 2005, TCC.

[24]  Leonid Boytsov,et al.  Decoding billions of integers per second through vectorization , 2012, Softw. Pract. Exp..

[25]  Mihir Bellare,et al.  Deterministic and Efficiently Searchable Encryption , 2007, CRYPTO.

[26]  Craig Gentry,et al.  Homomorphic Evaluation of the AES Circuit , 2012, IACR Cryptol. ePrint Arch..

[27]  Rafail Ostrovsky,et al.  Software protection and simulation on oblivious RAMs , 1996, JACM.

[28]  Nathan Chenette,et al.  Order-Preserving Symmetric Encryption , 2009, IACR Cryptol. ePrint Arch..

[29]  Shivam Bhasin,et al.  A survey on hardware trojan detection techniques , 2015, 2015 IEEE International Symposium on Circuits and Systems (ISCAS).

[30]  Mark Zhandry,et al.  Semantically Secure Order-Revealing Encryption: Multi-input Functional Encryption Without Obfuscation , 2015, EUROCRYPT.

[31]  Radu Sion,et al.  TrustedDB: A Trusted Hardware-Based Database with Privacy and Data Confidentiality , 2011, IEEE Transactions on Knowledge and Data Engineering.

[32]  Craig Gentry,et al.  Fully homomorphic encryption using ideal lattices , 2009, STOC '09.

[33]  Rafail Ostrovsky,et al.  Efficient computation on oblivious RAMs , 1990, STOC '90.