Secure Similarity Queries: Enabling Precision Medicine with Privacy

Until now, most medical treatments have been designed for the average patient. However, one size does not fit all: treatments that work well for some patients may not work for others. Precision medicine is an emerging approach to disease treatment and prevention that takes into account individual variability in people's genes, environments, and lifestyles. A critical component of precision medicine is finding existing treatments suited to a new patient by means of similarity queries. This raises significant concerns about patient privacy: how can such sensitive medical data be managed and queried while patient privacy is preserved? In this paper, we (1) briefly introduce the background of the precision medicine initiative, (2) review existing secure kNN queries and introduce a new class of secure skyline queries, and (3) summarize the challenges and investigate potential techniques for secure skyline queries.
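To make the two query types concrete, the following is a minimal plaintext sketch of how a kNN query and a skyline query differ. It involves no encryption and is not the paper's secure protocol. It assumes the standard textbook definitions: kNN ranks records by Euclidean distance to the query, and a record is in the skyline (relative to the query) if no other record is at least as close on every attribute and strictly closer on at least one. The function names, the two attributes, and the patient values are hypothetical and chosen only for illustration.

```python
import math

def knn(records, query, k):
    """Return the k records closest to the query by Euclidean distance."""
    return sorted(records, key=lambda r: math.dist(r, query))[:k]

def dominates(a, b, query):
    """True if a is at least as close to the query as b on every attribute
    and strictly closer on at least one (assumed dominance definition)."""
    da = [abs(x - q) for x, q in zip(a, query)]
    db = [abs(x - q) for x, q in zip(b, query)]
    return all(x <= y for x, y in zip(da, db)) and any(x < y for x, y in zip(da, db))

def skyline(records, query):
    """Return the records not dominated by any other record (naive O(n^2) scan)."""
    return [r for r in records if not any(dominates(o, r, query) for o in records)]

# Hypothetical records: (age, systolic blood pressure)
patients = [(40, 120), (45, 150), (60, 125), (70, 180)]
query = (50, 130)

nearest = knn(patients, query, 2)
sky = skyline(patients, query)
print(nearest)  # [(60, 125), (40, 120)]
print(sky)      # [(45, 150), (60, 125)]
```

In this toy data the two queries disagree: kNN returns (40, 120), which is dominated by (60, 125), while the skyline keeps (45, 150) because no other record is closer to the query in age. A skyline query needs no attribute weighting or single distance function, which is one reason it can complement kNN for similarity search.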
