Heterogeneous Data Management, Polystores, and Analytics for Healthcare: VLDB 2019 Workshops, Poly and DMAH, Los Angeles, CA, USA, August 30, 2019, Revised Selected Papers

The increasing pace of data collection has led to growing awareness of privacy risks, resulting in new data privacy regulations such as the General Data Protection Regulation (GDPR). Such regulations are an important step, but automatic compliance checking is challenging. In this work, we present a new paradigm, Data Capsule, for automatic compliance checking of data privacy regulations in heterogeneous data processing infrastructures. Our key insight is to pair a data subject’s data with a policy governing how the data is processed. Specified in our formal policy language, PrivPolicy, the policy is created and provided by the data subject alongside the data, and remains associated with the data throughout the life-cycle of data processing (e.g., data transformation by data processing systems, aggregation of multiple data subjects’ data). We introduce a solution for static enforcement of privacy policies based on the concept of residual policies, and present a novel algorithm based on abstract interpretation for deriving residual policies in PrivPolicy. Our solution ensures compliance automatically, and is designed for deployment alongside existing infrastructure. We also design and develop PrivGuard, a reference data capsule manager that implements all the functionalities of the Data Capsule paradigm.
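The idea of pairing data with a policy and carrying a residual policy through processing can be illustrated with a toy sketch. This is not PrivPolicy or PrivGuard: every name below (`DataCapsule`, `residual`, `apply_step`, `combine`, `may_release`, and the requirement strings) is hypothetical, a policy is simplified to a flat set of outstanding requirements, and the residual is tracked at run time on concrete data rather than derived statically by abstract interpretation as the abstract describes.

```python
from dataclasses import dataclass

# Hypothetical simplification: a policy is the set of requirement names
# that must all be satisfied before a result may be released.
Policy = frozenset


@dataclass(frozen=True)
class DataCapsule:
    rows: tuple      # the data subject's data
    policy: Policy   # requirements still outstanding for this data


def residual(policy, satisfied):
    """Return the requirements a processing step has not yet satisfied."""
    return Policy(policy - frozenset(satisfied))


def apply_step(capsule, transform, satisfied):
    """Run one processing step; the output carries the residual policy."""
    new_rows = tuple(transform(capsule.rows))
    return DataCapsule(new_rows, residual(capsule.policy, satisfied))


def combine(a, b):
    """Aggregate two subjects' capsules; the result must honour both policies."""
    return DataCapsule(a.rows + b.rows, Policy(a.policy | b.policy))


def may_release(capsule):
    """A result is releasable only when no requirement remains."""
    return not capsule.policy


# Usage with made-up data and requirement names.
alice = DataCapsule((("alice", 34),), Policy({"redact_name", "aggregate"}))
bob = DataCapsule((("bob", 51),), Policy({"redact_name"}))

merged = combine(alice, bob)
redacted = apply_step(
    merged, lambda rows: [("", age) for _, age in rows], {"redact_name"}
)
averaged = apply_step(
    redacted,
    lambda rows: [sum(age for _, age in rows) / len(rows)],
    {"aggregate"},
)
```

In this sketch the policy shrinks as steps discharge requirements, and aggregation takes the union of the inputs' requirements, so the merged data is never governed by less than either subject asked for.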
