Data collected from different sources are often logically interrelated but geographically distributed. A top-k query is an efficient way to find the most important objects in large volumes of data. A common way to process a top-k query over distributed data is to bring the data to a centralized entity (e.g., the cloud). However, this approach raises privacy concerns when the data are sensitive (e.g., eHealthcare data). Apart from data privacy, efficiency must also be taken into account. Existing work on top-k queries does not fully consider data privacy or efficiency. To address these shortcomings, in this paper we propose an efficient and privacy-preserving top-k query scheme over vertically distributed data. Specifically, we first design a data filtering technique that reduces the amount of data transmitted from each data source to the centralized entity, which can greatly reduce the communication overhead and computational cost. We then propose a privacy-preserving top-k query scheme over encrypted data based on homomorphic encryption, which preserves private information while still supporting the query functionality. Security analysis shows that the proposed scheme is privacy-preserving, and performance evaluation validates its efficiency.
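To make the idea of a top-k query over encrypted, vertically distributed data more concrete, the following is a minimal sketch and not the scheme proposed in the paper. It assumes that each data source holds one column of non-negative integer partial scores for the same set of objects, that an object's overall score is the sum of its partial scores, and that a textbook Paillier cryptosystem stands in for the homomorphic encryption. The function names (`keygen`, `encrypt`, `add_encrypted`, `decrypt`, `secure_top_k`), the toy key size, and the example data are all hypothetical; the data filtering step is omitted.

```python
import heapq
import math
import secrets

# Toy parameters (assumption): two Mersenne primes, far too small for real use.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p=P, q=Q):
    """Return (n, (lam, mu)) for textbook Paillier with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt a non-negative integer m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m * n
    return (1 + m * n) * pow(r, n, n2) % n2


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def secure_top_k(sources, k):
    """sources: list of columns, one per data source, aligned by object index."""
    n, priv = keygen()
    # Each data source encrypts its own column before sending it out.
    encrypted_columns = [[encrypt(n, s) for s in col] for col in sources]
    # The central entity aggregates ciphertexts without seeing any plaintext.
    totals = []
    for per_object in zip(*encrypted_columns):
        acc = per_object[0]
        for c in per_object[1:]:
            acc = add_encrypted(n, acc, c)
        totals.append(acc)
    # Simplification (assumption): a key holder decrypts every aggregate score.
    scores = [decrypt(n, priv, c) for c in totals]
    return heapq.nlargest(k, enumerate(scores), key=lambda t: t[1])


if __name__ == "__main__":
    source_a = [5, 1, 9, 3, 7]
    source_b = [2, 8, 4, 6, 1]
    source_c = [7, 3, 2, 9, 5]
    print(secure_top_k([source_a, source_b, source_c], k=2))
```

In this sketch the key holder still learns every aggregate score, so it only illustrates how additive homomorphism lets a central entity combine partial scores it cannot read. A full scheme would also have to hide the non-top-k results and apply filtering before transmission.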