Can you Really Anonymize the Donors of Genomic Data in Today's Digital World?

Rapid progress in genome sequencing technologies has made large amounts of genomic data available. Accelerating the pace of biomedical breakthroughs and discoveries requires not only collecting millions of genetic samples but also granting open access to genetic databases. A growing concern, however, is whether the privacy of this sensitive information and of its owners can be protected. In this work, we survey a wide spectrum of cross-layer privacy-breaching strategies against human genomic data that use both public genomic databases and other public non-genomic data. We outline the principles and outcomes of each technique and assess its technological complexity and maturity. We then review potential privacy-preserving countermeasures for each threat.
