Skyline Query Processing over Encrypted Data: An Attribute-Order-Preserving-Free Approach

Reconciling efficient relational query processing over Clouds with the security of the data themselves is emerging as one of the most challenging research problems in the Big Data era. Indeed, in current analytics-oriented engines, such as Google Analytics and Amazon S3, where key-value storage representations and efficient management models are employed to cope with the simultaneous processing of billions of transactions, querying encrypted data is becoming one of the most pressing problems, and it has attracted a great deal of attention from the research community. While this problem has been studied for a large variety of data formats, e.g., relational, RDF and multidimensional data, very few initiatives have addressed skyline query processing over encrypted data, which is nonetheless relevant for database analytics. To fill this methodological and technological gap, in this paper we present eSkyline, a prototype system and query interface that enables the processing of skyline queries over encrypted data, without preserving the order on each attribute as order-preserving encryption would do. Our system comprises an encryption scheme that facilitates the evaluation of domination relationships and hence allows state-of-the-art skyline processing algorithms to be used. To demonstrate the effectiveness and reliability of our system, we also provide the details of the underlying encryption scheme, together with a GUI that allows a user to interact with a server and showcases the efficiency of computing skyline queries and decrypting the results.
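To make the notion of a domination relationship concrete, the following is a minimal plaintext sketch of skyline computation in Python. It is not the eSkyline encryption scheme, whose details are not given here. The names `dominates` and `skyline`, the convention that smaller values are better on every attribute, and the pluggable `dom` parameter are all assumptions made for illustration. The point of the `dom` parameter is that a nested-loop skyline algorithm needs only a yes/no dominance test, which is the kind of operation the abstract says the encryption scheme facilitates.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Plaintext dominance test (assumed convention: smaller is better).

    p dominates q if p is no worse than q on every attribute and
    strictly better on at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point],
            dom: Callable[[Point, Point], bool] = dominates) -> List[Point]:
    """Nested-loop skyline: keep every point that no other point dominates.

    The algorithm touches the data only through `dom`, so any procedure
    that answers dominance queries could be substituted (hypothetical hook).
    """
    result: List[Point] = []
    for p in points:
        # Skip p if a current skyline candidate already dominates it.
        if any(dom(s, p) for s in result):
            continue
        # Otherwise drop the candidates that p dominates, then keep p.
        result = [s for s in result if not dom(p, s)]
        result.append(p)
    return result


# Hypothetical data: (price, distance) pairs, both to be minimized.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2)]
sky = skyline(hotels)
```

In this example, `(70, 5)` is dominated by `(60, 3)` and `(90, 2)` is dominated by `(80, 1)`, so neither belongs to the skyline.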
