A Scalable Distributed Private Stream Search System

With the advent of the big data era, data privacy has become a growing concern. On the one hand, users want fresher, lower-latency search results than ever before. On the other hand, they do not want to reveal their search criteria. To this end, this paper proposes a scalable distributed private stream search system in which the search criteria are hidden using a homomorphic encryption technique with three buffers. Most importantly, the system adopts a shared-nothing architecture to support horizontal scalability, and partitions the stream into segments to enable parallel querying and bitmap-index-based storage. Experimental results demonstrate the effectiveness and efficiency of our method for private stream search.
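To illustrate how homomorphic encryption can hide search criteria from the party that scans the stream, the following is a minimal sketch and not the system described in this paper. It assumes an additively homomorphic scheme of the Paillier type (the abstract does not name the scheme), a public dictionary of candidate words, and toy key sizes. The names keygen, build_query and score_document are hypothetical, and the three buffers, the segment partitioning and the bitmap index are not modeled.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair with generator g = n + 1.
    The tiny primes are an assumption for readability and are NOT secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n


def build_query(n, dictionary, keywords):
    """Client side: one ciphertext per dictionary word, Enc(1) for a
    keyword of interest and Enc(0) otherwise. The server cannot tell
    the two cases apart."""
    return {w: encrypt(n, 1 if w in keywords else 0) for w in dictionary}


def score_document(n, query, text):
    """Server side: multiplying ciphertexts adds the hidden plaintexts,
    so the result encrypts the number of distinct matching keywords."""
    n2 = n * n
    acc = encrypt(n, 0)
    for w in set(text.split()):
        if w in query:
            acc = (acc * query[w]) % n2
    return acc


if __name__ == "__main__":
    n, priv = keygen()
    dictionary = ["alpha", "beta", "gamma", "delta"]
    query = build_query(n, dictionary, {"beta", "delta"})
    for doc in ["alpha beta", "gamma alpha", "beta delta gamma"]:
        encrypted_score = score_document(n, query, doc)
        print(doc, "->", decrypt(n, priv, encrypted_score))
```

In this sketch the server only ever handles ciphertexts, and the client learns the match count for each stream item after decryption. A practical system would additionally need to return the matching items themselves in bounded space, which is the role buffers typically play in private stream search schemes.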
