SoK: Cryptographically Protected Database Search

Protected database search systems cryptographically isolate the roles of reading from, writing to, and administering the database. This separation limits unnecessary administrator access and protects data in the case of system breaches. Since protected search was introduced in 2000, the area has grown rapidly, systems are offered by academia, start-ups, and established companies. However, there is no best protected search system or set of techniques. Design of such systems is a balancing act between security, functionality, performance, and usability. This challenge is made more difficult by ongoing database specialization, as some users will want the functionality of SQL, NoSQL, or NewSQL databases. This database evolution will continue, and the protected search community should be able to quickly provide functionality consistent with newly invented databases. At the same time, the community must accurately and clearly characterize the tradeoffs between different approaches. To address these challenges, we provide the following contributions:1) An identification of the important primitive operations across database paradigms. We find there are a small number of base operations that can be used and combined to support a large number of database paradigms.2) An evaluation of the current state of protected search systems in implementing these base operations. This evaluation describes the main approaches and tradeoffs for each base operation. Furthermore, it puts protected search in the context of unprotected search, identifying key gaps in functionality.3) An analysis of attacks against protected search for different base queries.4) A roadmap and tools for transforming a protected search system into a protected database, including an open-source performance evaluation platform and initial user opinions of protected search.

[1]  Ramakrishnan Srikant,et al.  Order preserving encryption for numeric data , 2004, SIGMOD '04.

[2]  Arkady Yerukhimovich,et al.  POPE: Partial Order Preserving Encoding , 2016, CCS.

[3]  John Miles Smith,et al.  Optimizing the performance of a relational algebra database interface , 1975, CACM.

[4]  Murat Kantarcioglu,et al.  Efficient Similarity Search over Encrypted Data , 2012, 2012 IEEE 28th International Conference on Data Engineering.

[5]  Dan Pritchett,et al.  BASE: An Acid Alternative , 2008, ACM Queue.

[6]  Henrik Loeser,et al.  "One Size Fits All": An Idea Whose Time Has Come and Gone? , 2011, BTW.

[7]  Charalampos Papamanthou,et al.  Parallel and Dynamic Searchable Symmetric Encryption , 2013, Financial Cryptography.

[8]  G. Linoff,et al.  Mining the Web: Transforming Customer Data into Customer Value , 2002 .

[9]  Robert K. Cunningham,et al.  Computing on masked data: a high performance method for improving big data veracity , 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).

[10]  Jeffrey Scott Vitter,et al.  Proceedings of the thirtieth annual ACM symposium on Theory of computing , 1998, STOC 1998.

[11]  Jeremy Kepner,et al.  D4M: Bringing associative arrays to database engines , 2015, 2015 IEEE High Performance Extreme Computing Conference (HPEC).

[12]  Tal Malkin,et al.  Malicious-Client Security in Blind Seer: A Scalable Private DBMS , 2015, 2015 IEEE Symposium on Security and Privacy.

[13]  Elaine Shi,et al.  Practical Dynamic Searchable Encryption with Small Leakage , 2014, NDSS.

[14]  George Kollios,et al.  GRECS: Graph Encryption for Approximate Shortest Distance Queries , 2015, IACR Cryptol. ePrint Arch..

[15]  Wilson C. Hsieh,et al.  Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.

[16]  Hugo Krawczyk,et al.  Highly-Scalable Searchable Symmetric Encryption with Support for Boolean Queries , 2013, IACR Cryptol. ePrint Arch..

[17]  Seny Kamara,et al.  SQL on Structurally-Encrypted Databases , 2018, IACR Cryptol. ePrint Arch..

[18]  Jonathan Herzog,et al.  A test-suite generator for database systems , 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).

[19]  Ling Ren,et al.  Path ORAM , 2012, J. ACM.

[20]  Adam J. Aviv,et al.  A Practical Oblivious Map Data Structure with Secure Deletion and History Independence , 2015, 2016 IEEE Symposium on Security and Privacy (SP).

[21]  David Cash,et al.  Leakage-Abuse Attacks Against Searchable Encryption , 2015, IACR Cryptol. ePrint Arch..

[22]  Dan Suciu,et al.  Demonstration of the Myria big data management service , 2014, SIGMOD Conference.

[23]  Andrew Pavlo,et al.  What's Really New with NewSQL? , 2016, SGMD.

[24]  Arkady Yerukhimovich,et al.  Cryptography for Big Data Security , 2015, IACR Cryptol. ePrint Arch..

[25]  Michael Stonebraker,et al.  A Demonstration of the BigDAWG Polystore System , 2015, Proc. VLDB Endow..

[26]  David Cash,et al.  The Locality of Searchable Symmetric Encryption , 2014, IACR Cryptol. ePrint Arch..

[27]  Alexander S. Szalay,et al.  Scalable database query processing , 2012 .

[28]  David A. Bader,et al.  Graphs, Matrices, and the GraphBLAS: Seven Good Reasons , 2015, ICCS.

[29]  Moni Naor,et al.  Private Information Retrieval by Keywords , 1998, IACR Cryptol. ePrint Arch..

[30]  Ramez Elmasri,et al.  Fundamentals of Database Systems , 1989 .

[31]  Brent Waters,et al.  Candidate Indistinguishability Obfuscation and Functional Encryption for all Circuits , 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[32]  Indrajit Ray,et al.  Multi-keyword Similarity Search Over Encrypted Cloud Data , 2014, IACR Cryptol. ePrint Arch..

[33]  Daniel J. Abadi,et al.  Data Management in the Cloud: Limitations and Opportunities , 2009, IEEE Data Eng. Bull..

[34]  Elaine Shi,et al.  Predicate Privacy in Encryption Systems , 2009, IACR Cryptol. ePrint Arch..

[35]  Murat Kantarcioglu,et al.  Access Pattern disclosure on Searchable Encryption: Ramification, Attack and Mitigation , 2012, NDSS.

[36]  Angelos D. Keromytis,et al.  Blind Seer: A Scalable Private DBMS , 2014, 2014 IEEE Symposium on Security and Privacy.

[37]  Cong Wang,et al.  Efficient verifiable fuzzy keyword search over encrypted data in cloud computing , 2013, Comput. Sci. Inf. Syst..

[38]  Raphael Bost,et al.  ∑oφoς: Forward Secure Searchable Encryption , 2016, CCS.

[39]  Andreas Peter,et al.  A Survey of Provably Secure Searchable Encryption , 2014, ACM Comput. Surv..

[40]  Adam O'Neill,et al.  Generic Attacks on Secure Outsourced Databases , 2016, CCS.

[41]  Jeremy Kepner,et al.  Graphulo implementation of server-side sparse matrix multiply in the Accumulo database , 2015, 2015 IEEE High Performance Extreme Computing Conference (HPEC).

[42]  Carl A. Gunter,et al.  Dynamic Searchable Encryption via Blind Storage , 2014, 2014 IEEE Symposium on Security and Privacy.

[43]  Craig Gentry,et al.  Better Bootstrapping in Fully Homomorphic Encryption , 2012, Public Key Cryptography.

[44]  Olga Ohrimenko,et al.  Sorting and Searching Behind the Curtain , 2015, Financial Cryptography.

[45]  Nathan Chenette,et al.  Efficient Fuzzy Search on Encrypted Data , 2014, FSE.

[46]  Hugo Krawczyk,et al.  Rich Queries on Encrypted Data: Beyond Exact Matches , 2015, ESORICS.

[47]  Silvio Micali,et al.  A Completeness Theorem for Protocols with Honest Majority , 1987, STOC 1987.

[48]  Seny Kamara,et al.  Boolean Searchable Symmetric Encryption with Worst-Case Sub-linear Complexity , 2017, EUROCRYPT.

[49]  Rishabh Poddar,et al.  Arx: A Strongly Encrypted Database System , 2016, IACR Cryptol. ePrint Arch..

[50]  Thomas Ristenpart,et al.  Leakage-Abuse Attacks against Order-Revealing Encryption , 2017, 2017 IEEE Symposium on Security and Privacy (SP).

[51]  Avi Wigderson,et al.  Completeness Theorems for Non-Cryptographic Fault-Tolerant Distributed Computation (Extended Abstract) , 1988, STOC.

[52]  Nikita Shamgunov The MemSQL In-Memory Database System , 2014, IMDM@VLDB.

[53]  Joseph K. Bradley,et al.  Spark SQL: Relational Data Processing in Spark , 2015, SIGMOD Conference.

[54]  Michael T. Goodrich,et al.  Authenticated Data Structures for Graph and Geometric Searching , 2003, CT-RSA.

[55]  Jeremy Kepner,et al.  Polystore mathematics of relational algebra , 2017, 2017 IEEE International Conference on Big Data (Big Data).

[56]  Sanjam Garg,et al.  TWORAM: Efficient Oblivious RAM in Two Rounds with Applications to Searchable Encryption , 2016, CRYPTO.

[57]  Melissa Chase,et al.  Structured Encryption and Controlled Disclosure , 2010, IACR Cryptol. ePrint Arch..

[58]  Martin F. Porter,et al.  An algorithm for suffix stripping , 1997, Program.

[59]  Rishabh Poddar,et al.  A Secure One-Roundtrip Index for Range Queries , 2016, IACR Cryptol. ePrint Arch..

[60]  Craig Gentry,et al.  Fully homomorphic encryption using ideal lattices , 2009, STOC '09.

[61]  Ravi Kumar,et al.  Pig latin: a not-so-foreign language for data processing , 2008, SIGMOD Conference.

[62]  ChooKim-Kwang Raymond,et al.  Searchable Symmetric Encryption , 2017 .

[63]  Julien Bringer,et al.  Error-Tolerant Searchable Encryption , 2009, 2009 IEEE International Conference on Communications.

[64]  Andrew Chi-Chih Yao,et al.  Protocols for Secure Computations (Extended Abstract) , 1982, FOCS.

[65]  Paul G. Brown,et al.  Overview of sciDB: large scale array storage, processing and analysis , 2010, SIGMOD Conference.

[66]  Rafail Ostrovsky,et al.  Searchable symmetric encryption: improved definitions and efficient constructions , 2006, CCS '06.

[67]  Hugo Krawczyk,et al.  Dynamic Searchable Encryption in Very-Large Databases: Data Structures and Implementation , 2014, NDSS.

[68]  E. F. CODD,et al.  A relational model of data for large shared data banks , 1970, CACM.

[69]  Stavros Papadopoulos,et al.  Practical Private Range Search Revisited , 2016, SIGMOD Conference.

[70]  Julien Bringer,et al.  Biometric Identification over Encrypted Data Made Feasible , 2009, ICISS.

[71]  Hari Balakrishnan,et al.  CryptDB: processing queries on an encrypted database , 2012, CACM.

[72]  D. Blumenthal Launching HITECH. , 2010, The New England journal of medicine.

[73]  Abhi Shelat,et al.  Computing on Authenticated Data , 2012, Journal of Cryptology.

[74]  Roberto Tamassia,et al.  Time and Space Efficient Algorithms for Two-Party Authenticated Data Structures , 2007, ICICS.

[75]  Yutaka Hata,et al.  Fuzzy-ASM Based Automated Skull Stripping Method from Infantile Brain MR Images , 2007 .

[76]  Jeremy Kepner,et al.  Achieving 100,000,000 database inserts per second using Accumulo and D4M , 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).

[77]  Jonathan Katz,et al.  All Your Queries Are Belong to Us: The Power of File-Injection Attacks on Searchable Encryption , 2016, USENIX Security Symposium.

[78]  Shuguo Han,et al.  Privacy-Preserving Linear Fisher Discriminant Analysis , 2008, PAKDD.

[79]  Stanislaw Jarecki,et al.  Three-Party ORAM for Secure Computation , 2015, ASIACRYPT.

[80]  Andreas Reuter,et al.  Principles of transaction-oriented database recovery , 1983, CSUR.

[81]  Radu Sion,et al.  TrustedDB: A Trusted Hardware-Based Database with Privacy and Data Confidentiality , 2011, IEEE Transactions on Knowledge and Data Engineering.

[82]  Peter Willett,et al.  The Porter stemming algorithm: then and now , 2006, Program.

[83]  Mihir Bellare,et al.  Deterministic and Efficiently Searchable Encryption , 2007, CRYPTO.

[84]  Hugo Krawczyk,et al.  Outsourced symmetric private information retrieval , 2013, IACR Cryptol. ePrint Arch..

[85]  Ran Canetti,et al.  Modular Order-Preserving Encryption, Revisited , 2015, SIGMOD Conference.

[86]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[87]  Vitaly Shmatikov,et al.  Breaking Web Applications Built On Top of Encrypted Data , 2016, CCS.

[88]  Giovanni Di Crescenzo,et al.  Privacy-Preserving Range Queries from Keyword Queries , 2015, DBSec.

[89]  Rafail Ostrovsky,et al.  Private Large-Scale Databases with Distributed Searchable Symmetric Encryption , 2016, CT-RSA.

[90]  Yuval Ishai,et al.  Protecting data privacy in private information retrieval schemes , 1998, STOC '98.

[91]  Jennifer Widom,et al.  A First Course in Database Systems , 1997 .

[92]  Elisa Bertino,et al.  Database security - concepts, approaches, and challenges , 2005, IEEE Transactions on Dependable and Secure Computing.

[93]  Nathan Chenette,et al.  Order-Preserving Encryption Revisited: Improved Security Analysis and Alternative Solutions , 2011, CRYPTO.

[94]  Muhammad Naveed,et al.  The Fallacy of Composition of Oblivious RAM and Searchable Encryption , 2015, IACR Cryptol. ePrint Arch..

[95]  Yang Yang,et al.  HEtest: A Homomorphic Encryption Testing Framework , 2015, Financial Cryptography Workshops.

[96]  Kevin Loney,et al.  Oracle Database 10g The Complete Reference , 2004 .

[97]  Yannis Rouselakis,et al.  Property Preserving Symmetric Encryption , 2012, EUROCRYPT.

[98]  Cong Wang,et al.  Achieving usable and privacy-assured similarity search over outsourced cloud data , 2012, 2012 Proceedings IEEE INFOCOM.

[99]  Robert K. Cunningham,et al.  Automated Assessment of Secure Search Systems , 2015, OPSR.

[100]  Eyal Kushilevitz,et al.  Private information retrieval , 1998, JACM.

[101]  Charles V. Wright,et al.  Inference Attacks on Property-Preserving Encrypted Databases , 2015, CCS.

[102]  Michael Stonebraker,et al.  The design of POSTGRES , 1986, SIGMOD '86.

[103]  Seny Kamara,et al.  Structured Encryption and Leakage Suppression , 2018, IACR Cryptol. ePrint Arch..

[104]  Eric Chen,et al.  Cocoon : Encrypted Substring Search , 2016 .

[105]  Nickolai Zeldovich,et al.  An Ideal-Security Protocol for Order-Preserving Encoding , 2013, 2013 IEEE Symposium on Security and Privacy.

[106]  Alptekin Küpçü,et al.  Database Outsourcing with Hierarchical Authenticated Data Structures , 2013, ICISC.

[107]  Jeremy Kepner,et al.  Graphulo: Linear Algebra Graph Kernels for NoSQL Databases , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.

[108]  Michael Stonebraker,et al.  The BigDAWG polystore system and architecture , 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).

[109]  Arkady Yerukhimovich,et al.  Computing on Masked Data to improve the security of big data , 2015, 2015 IEEE International Symposium on Technologies for Homeland Security (HST).

[110]  Lars George,et al.  HBase: The Definitive Guide , 2011 .

[111]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[112]  Charles V. Wright,et al.  The Shadow Nemesis: Inference Attacks on Efficiently Deployable, Efficiently Searchable Encryption , 2016, CCS.

[113]  Kartik Nayak,et al.  Oblivious Data Structures , 2014, IACR Cryptol. ePrint Arch..

[114]  Debmalya Biswas,et al.  Performance Comparison of Secure Comparison Protocols , 2009, 2009 20th International Workshop on Database and Expert Systems Application.

[115]  Tarik Moataz,et al.  Oblivious Substring Search with Updates , 2015, IACR Cryptol. ePrint Arch..

[116]  Rafail Ostrovsky,et al.  How to Garble RAM Programs , 2013, EUROCRYPT.

[117]  Rafail Ostrovsky,et al.  Software protection and simulation on oblivious RAMs , 1996, JACM.

[118]  Nathan Chenette,et al.  Order-Preserving Symmetric Encryption , 2009, IACR Cryptol. ePrint Arch..

[119]  Dawn Xiaodong Song,et al.  Practical techniques for searches on encrypted data , 2000, Proceeding 2000 IEEE Symposium on Security and Privacy. S&P 2000.

[120]  Craig Gentry,et al.  (Leveled) fully homomorphic encryption without bootstrapping , 2012, ITCS '12.

[121]  Stefan Savage,et al.  An analysis of underground forums , 2011, IMC '11.

[122]  Jim Webber,et al.  A programmatic introduction to Neo4j , 2018, SPLASH '12.

[123]  Pieter H. Hartel,et al.  Selective Document Retrieval from Encrypted Database , 2012, ISC.

[124]  Dong Hoon Lee,et al.  Secure Similarity Search , 2007, 2007 IEEE International Conference on Granular Computing (GRC 2007).

[125]  Oded Goldreich,et al.  Towards a theory of software protection and simulation by oblivious RAMs , 1987, STOC.

[126]  Gary T. Leavens,et al.  Proceedings of the 3rd annual conference on Systems, programming, and applications: software for humanity , 2012, SPLASH 2012.

[127]  Michael Backes,et al.  ADSNARK: Nearly Practical and Privacy-Preserving Proofs on Authenticated Data , 2015, 2015 IEEE Symposium on Security and Privacy.

[128]  Rob W.W. Hooft,et al.  The value of data , 2011, Nature Genetics.

[129]  Michael Stonebraker,et al.  Readings in Database Systems , 1988 .

[130]  Andrew Chi-Chih Yao,et al.  How to Generate and Exchange Secrets (Extended Abstract) , 1986, FOCS.

[131]  Ivan Damgård,et al.  Homomorphic encryption and secure comparison , 2008, Int. J. Appl. Cryptogr..

[132]  Christopher Frost,et al.  Spanner: Google's Globally-Distributed Database , 2012, OSDI.

[133]  Laura M. Haas,et al.  Towards heterogeneous multimedia information systems: the Garlic approach , 1995, Proceedings RIDE-DOM'95. Fifth International Workshop on Research Issues in Data Engineering-Distributed Object Management.

[134]  Andreas Schaad,et al.  Experiences and observations on the industrial implementation of a system to search over outsourced encrypted data , 2014, Sicherheit.

[135]  Melissa Chase,et al.  Substring-Searchable Symmetric Encryption , 2015, Proc. Priv. Enhancing Technol..