Differentially Oblivious Database Joins: Overcoming the Worst-Case Curse of Fully Oblivious Algorithms

Numerous high-profile works have shown that access patterns to even encrypted databases can leak secret information and sometimes even lead to reconstruction of the entire database. To thwart access pattern leakage, the literature has focused on oblivious algorithms, where obliviousness requires that the access patterns leak nothing about the input data. In this paper, we consider the Join operator, an important database primitive that has been extensively studied and optimized. Unfortunately, any fully oblivious Join algorithm must always pad the result to the worst-case length, which is quadratic in the data size N. In comparison, an insecure baseline incurs only O(R + N) cost, where R is the true result length, and in the common case in practice R is relatively short. As a typical example, when R = O(N), any fully oblivious algorithm must inherently incur a prohibitive, N-fold slowdown relative to the insecure baseline. Indeed, the (non-private) database and algorithms literature invariably focuses on studying the instance-specific rather than worst-case performance of database algorithms. Unfortunately, the stringent notion of full obliviousness precludes the design of efficient algorithms with non-trivial instance-specific performance. To overcome this worst-case performance barrier of full obliviousness and enable algorithms with good instance-specific performance, we consider a relaxed notion of access pattern privacy called (ε, δ)-differential obliviousness (DO), originally proposed in the seminal work of Chan et al. (SODA’19). Rather than insisting that the access patterns leak no information whatsoever, the relaxed DO notion requires that the access patterns satisfy (ε, δ)-differential privacy. We show that by adopting the relaxed DO notion, we can obtain efficient database Join mechanisms whose instance-specific performance approximately matches the insecure baseline, while still offering a meaningful notion of privacy to individual users.
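
To make the gap between the true result length R and the worst-case padded length concrete, the following minimal sketch counts both for an equi-join of two key columns. It is an illustration only, not the paper's algorithm: the function name `join_sizes` and the example tables are hypothetical, and the worst case is taken to be the product of the two table sizes.

```python
from collections import Counter


def join_sizes(left_keys, right_keys):
    """Return (true_len, worst_len) for an equi-join on two key lists.

    true_len  is R, the number of rows the join actually produces.
    worst_len is the length a fully oblivious join would have to pad to,
              assumed here to be |left| * |right|.
    """
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # A key with multiplicity a on the left and b on the right yields a * b rows.
    true_len = sum(a * right_counts[k] for k, a in left_counts.items())
    # Worst case: every left row matches every right row.
    worst_len = len(left_keys) * len(right_keys)
    return true_len, worst_len


# Hypothetical instance: two 1000-row tables whose keys match one-to-one.
left = list(range(1000))
right = list(range(1000))
R, worst = join_sizes(left, right)
print(R, worst, worst // R)
```

On this instance the true result has length linear in the input, while the padded length is quadratic, so the padding alone costs a factor proportional to the table size.
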
Complementing our upper bound results, we also prove new lower bounds on the performance of any DO Join algorithm. Differential obliviousness (DO) is a new and relatively unexplored notion. Following the pioneering investigations by Chan et al. and others, our work is among the very first to formally explore how DO can help overcome the worst-case performance curse of full obliviousness; moreover, we motivate our work with database applications. Our work provides new evidence that DO might be a promising notion, and opens up several exciting future directions.

∗Author order is randomized.
