Dynamic Skyline Queries on Encrypted Data Using Result Materialization

Skyline computation is an increasingly popular query, with broad applicability in domains such as healthcare, travel and finance. Given the recent trend to outsource databases and query evaluation, and due to the proprietary and sometimes highly sensitive nature of the data (e.g., in healthcare), it is essential to evaluate skylines on encrypted datasets. Several research efforts have acknowledged the importance of secure skyline computation, but existing solutions suffer from at least one of the following shortcomings: (i) they only provide ad-hoc security; (ii) they are prohibitively expensive; or (iii) they rely on unrealistic assumptions, such as the presence of multiple non-colluding parties in the protocol. Inspired by solutions for secure nearest-neighbor (NN) computation, we conjecture that the most secure and efficient way to compute skylines is through result materialization. However, this approach is significantly more challenging for skylines than for NN queries. We exhaustively study and provide algorithms for pre-computation of skyline results, and we perform an in-depth theoretical analysis of this process. We show that pre-computing results while minimizing storage overhead is NP-hard, and we provide dynamic programming and greedy heuristics that solve the problem more efficiently, while maintaining storage at reasonable levels. Our algorithms are novel and applicable to plain-text skyline computation, but we focus on the encrypted setting, where materialization reduces the cost of skyline computation from hours to seconds. Extensive experiments show that we clearly outperform existing work in terms of performance, and our security analysis proves that we obtain a smaller (and quantifiable) data leakage than competitors.
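For readers unfamiliar with the query type, the following is a minimal plain-text sketch of a dynamic skyline, assuming the standard definition in which each point is mapped to its per-dimension absolute distance from a query point and smaller values are preferred. It is not the materialization algorithm of this paper; the function names `dominates` and `dynamic_skyline` and the sample data are hypothetical and serve only to illustrate why the result depends on the query point.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every dimension and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def dynamic_skyline(points: List[Point], query: Point) -> List[Point]:
    """Naive O(n^2) dynamic skyline with respect to `query` (illustrative only)."""
    # Map every point to its per-dimension absolute distance from the query point.
    mapped = [tuple(abs(x - q) for x, q in zip(p, query)) for p in points]
    # Keep the original points whose mapped image is not dominated by any other image.
    return [
        p
        for p, m in zip(points, mapped)
        if not any(dominates(other, m) for other in mapped)
    ]


if __name__ == "__main__":
    data = [(1, 9), (3, 3), (9, 1), (5, 5), (4, 8)]  # hypothetical sample data
    print(dynamic_skyline(data, (0, 0)))
    print(dynamic_skyline(data, (5, 5)))
```

Because the two query points yield different skylines over the same data, a server that answers from pre-computed results has to store a result for every region of the query space in which the skyline stays the same. This is the storage overhead the abstract refers to.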
