Toward Efficient Multi-Keyword Fuzzy Search Over Encrypted Outsourced Data With Accuracy Improvement

Keyword-based search over encrypted outsourced data has become an important tool in the current cloud computing scenario. The majority of existing techniques focus on multi-keyword exact match or single-keyword fuzzy search. However, these techniques are of less practical significance in real-world applications than multi-keyword fuzzy search over encrypted data. The first attempt to construct such a multi-keyword fuzzy search scheme was reported by Wang et al., who used locality-sensitive hashing functions and Bloom filters to meet the goal of multi-keyword fuzzy search. Nevertheless, Wang et al.'s scheme was effective only for a one-letter mistake in a keyword and not for other common spelling mistakes. Moreover, their scheme was vulnerable to server out-of-order problems during the ranking process and did not consider the keyword weight. In this paper, we propose an efficient multi-keyword fuzzy ranked search scheme, based on Wang et al.'s scheme, that addresses the aforementioned problems. First, we develop a new method of keyword transformation based on the uni-gram, which simultaneously improves accuracy and makes it possible to handle other spelling mistakes. In addition, keywords with the same root can be queried using a stemming algorithm. Furthermore, we consider the keyword weight when selecting an adequate matching file set. Experiments using real-world data show that our scheme is practically efficient and achieves high accuracy.
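
To illustrate why a uni-gram keyword transformation can tolerate more than a single-letter mistake, the following is a minimal sketch and not the construction used in the paper. It assumes each keyword is mapped to a bit vector with one bit per (letter, occurrence) pair; the names `unigram_vector` and `hamming_distance` and the parameter `MAX_REPEATS` are hypothetical and chosen only for this example. Encryption, locality-sensitive hashing, and Bloom filter insertion are omitted.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
MAX_REPEATS = 5  # hypothetical cap on how many occurrences of a letter are tracked


def unigram_vector(keyword):
    """Map a keyword to a bit vector with one bit per (letter, occurrence) pair.

    Letter order is discarded, so transposed letters produce the same vector.
    """
    vector = [0] * (len(ALPHABET) * MAX_REPEATS)
    seen = Counter()
    for ch in keyword.lower():
        if ch not in ALPHABET:
            continue  # this sketch ignores digits and symbols
        if seen[ch] < MAX_REPEATS:
            vector[ALPHABET.index(ch) * MAX_REPEATS + seen[ch]] = 1
        seen[ch] += 1
    return vector


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    reference = unigram_vector("secure")
    for candidate in ["secuer", "secur", "sekure", "network"]:
        distance = hamming_distance(reference, unigram_vector(candidate))
        print(candidate, distance)
```

Under these assumptions, a transposition ("secuer") leaves the vector unchanged, an omitted letter ("secur") changes one bit, and a substituted letter ("sekure") changes two bits, while an unrelated keyword ("network") lies much farther away. A distance-preserving step such as locality-sensitive hashing can then place misspellings close to the intended keyword.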
