ScrambleDB: Oblivious (Chameleon) Pseudonymization-as-a-Service

Abstract

Pseudonymization is a widely deployed technique for de-sensitizing data sets by consistently replacing identifying attributes with non-sensitive surrogates. However, all existing solutions are impractical to deploy in settings where data is accumulated from distributed sources: they either require sharing the same secret key with all sources or rely on a fully trusted service to compute the pseudonyms consistently. Further, the consistency of pseudonyms, which is required to maintain the data’s utility, comes with inherent and severe privacy limitations. This paper solves the key management and privacy challenges by introducing oblivious pseudonymization-as-a-service. Here, pseudonymization is outsourced to a central yet fully oblivious entity: the service learns neither the sensitive information nor the pseudonyms it produces. Further, to obtain better privacy, we no longer require pseudonyms to be computed consistently and instead introduce a dedicated join procedure. Data at rest is pseudonymized in a fully unlinkable manner. Only when certain subsets of the data are needed is the linkage established, through a controlled and non-transitive join operation. We formally define the desired security properties in the Universal Composability (UC) framework and propose a generic protocol that provably satisfies them. The core of our scheme is a 3-party oblivious and convertible pseudorandom function (PRF), which we believe to be of independent interest.
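To make the idea of unlinkable pseudonyms with a controlled join more concrete, the following is a minimal toy sketch of a convertible PRF. It is an illustration under stated assumptions, not the construction from the paper: it assumes the textbook form F_k(x) = H(x)^k in a prime-order group, uses a deliberately tiny group (P = 1019) that offers no security, and omits both the obliviousness of the service and the three-party split. All names (`pseudonymize`, `conversion_token`, `convert`, the keys) are hypothetical.

```python
import hashlib

# Toy parameters (assumption, far too small for real use):
# P = 2*Q + 1 is a safe prime, so the squares mod P form a subgroup of prime order Q.
P = 1019
Q = 509


def hash_to_group(identifier: str) -> int:
    """Map an identifier to a non-identity element of the order-Q subgroup."""
    counter = 0
    while True:
        digest = hashlib.sha256(f"{counter}:{identifier}".encode()).digest()
        h = pow(int.from_bytes(digest, "big") % P, 2, P)  # squaring lands in the subgroup
        if h not in (0, 1):
            return h
        counter += 1


def pseudonymize(key: int, identifier: str) -> int:
    """Toy PRF: F_key(identifier) = H(identifier)^key mod P."""
    return pow(hash_to_group(identifier), key, P)


def conversion_token(key_from: int, key_to: int) -> int:
    """Exponent that turns pseudonyms under key_from into pseudonyms under key_to."""
    return key_to * pow(key_from, -1, Q) % Q  # modular inverse needs Python 3.8+


def convert(pseudonym: int, token: int) -> int:
    """Re-key a pseudonym without ever seeing the underlying identifier."""
    return pow(pseudonym, token, P)


# Each table at rest uses its own key, so the same person gets different pseudonyms.
KEY_A, KEY_B = 123, 456
nym_a = pseudonymize(KEY_A, "alice")
nym_b = pseudonymize(KEY_B, "alice")

# A join converts both pseudonyms to a fresh join-specific key; only then do they match.
KEY_JOIN_1, KEY_JOIN_2 = 77, 78
joined_a = convert(nym_a, conversion_token(KEY_A, KEY_JOIN_1))
joined_b = convert(nym_b, conversion_token(KEY_B, KEY_JOIN_1))

# A second join under a different key yields pseudonyms unrelated to the first join.
other_join_a = convert(nym_a, conversion_token(KEY_A, KEY_JOIN_2))
```

In this sketch, `nym_a` and `nym_b` differ although both belong to the same identifier, `joined_a` equals `joined_b`, and `other_join_a` differs from `joined_a`, which mirrors the non-transitive behaviour described above. Because the toy group has only 509 elements, distinct identifiers can collide; a realistic instantiation would use a standard cryptographic group.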
