Privacy preserving large scale DNA read-mapping in MapReduce framework using FPGAs

Read-mapping, i.e., finding certain patterns in a long DNA sequence, is an important operation in molecular biology. It is widely used in biological analyses including SNP discovery, genotyping, and personal genomics. Because next-generation DNA sequencing machines generate enormous amounts of sequence data, implementing the read-mapping algorithm in the MapReduce framework and outsourcing the computation to the cloud is an attractive option. Data privacy becomes a major concern in this setting because DNA sequences are highly sensitive. Encryption can protect the data, but it is very difficult for the cloud to process ciphertexts. In the MapReduce framework, even if values (the data to be processed) are protected by encryption, keys cannot be encrypted with semantically secure encryption schemes, because doing so would interfere with the MapReduce scheduling mechanism, which relies on comparing keys. If the keys are left unprotected, however, attackers may extract useful information from them. We propose a solution that securely outsources read-mapping computations in the MapReduce framework by leveraging the inherent tamper-resistant properties of FPGAs. We also provide a method to protect the keys generated in this process. We implement our solution on FPGAs and apply it to several data sets. The security evaluation and experimental results show that with this method DNA sequence privacy is well protected and the extra cost is acceptable.
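
The tension between key encryption and MapReduce scheduling can be illustrated with a small sketch. It is not the method proposed in the paper, whose key-protection scheme is not described in this abstract. It is a toy Python model that assumes k-mer seeds are used as MapReduce keys; the names `SECRET`, `deterministic_token`, `randomized_token`, and `candidate_matches` are hypothetical. A keyed deterministic token (HMAC) keeps equal seeds grouped together, while a randomized token, standing in for semantically secure encryption, gives every occurrence a different key, so the shuffle step can no longer bring matching seeds together.

```python
import hashlib
import hmac
import os
from collections import defaultdict

# Hypothetical secret; in an FPGA-based design it would stay inside trusted hardware.
SECRET = b"demo-secret-for-illustration-only"


def deterministic_token(seed: str) -> str:
    """Keyed, deterministic token: equal seeds always map to equal keys."""
    return hmac.new(SECRET, seed.encode(), hashlib.sha256).hexdigest()


def randomized_token(seed: str) -> str:
    """Stand-in for semantically secure encryption: a fresh nonce per call,
    so equal seeds map to different keys."""
    nonce = os.urandom(16)
    tag = hmac.new(SECRET, nonce + seed.encode(), hashlib.sha256).digest()
    return (nonce + tag).hex()


def map_phase(sequence, k, protect, source):
    """Emit (protected k-mer, (source, position)) pairs."""
    for i in range(len(sequence) - k + 1):
        yield protect(sequence[i:i + k]), (source, i)


def shuffle(pairs):
    """Group values by key, as the MapReduce scheduler would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def candidate_matches(reference, read, k, protect):
    """Reduce phase: pair reference and read positions that share a key."""
    pairs = list(map_phase(reference, k, protect, "ref"))
    pairs += list(map_phase(read, k, protect, "read"))
    hits = []
    for values in shuffle(pairs).values():
        ref_pos = [p for s, p in values if s == "ref"]
        read_pos = [p for s, p in values if s == "read"]
        hits.extend((r, q) for r in ref_pos for q in read_pos)
    return sorted(hits)
```

With the reference `"ACGTACGT"`, the read `"GTAC"`, and `k = 4`, the deterministic variant finds the match at reference position 2, whereas the randomized variant finds nothing because no two keys are equal. Deterministic tokens still reveal which keys are equal and how often each occurs, which is the kind of leakage from keys that the abstract warns about.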
