Genetic Data Sharing and Privacy

Genetic data have provided valuable insights into disease cause and risk, as well as into drug discovery and development, in neuroscience. For example, human genetics studies have shed light on cognition (Glahn et al. 2013) and psychiatric disorders (Kao et al. 2010). The genetic basis of several inherited disorders, such as Down's Syndrome and Tay-Sachs disease, is well known, and other associations, such as the role of APOE in Alzheimer's disease, are still extensively studied. However, despite advances in understanding the human genome, there are concerns about the privacy of genetic data and the potential for discrimination resulting from its disclosure, and oversight of genetic testing has been incomplete (Scheuner et al. 2008). At the same time, efforts to share research data have increased in order to enable scientific discovery and achieve cost efficiencies. It has become clear that no scientist can guarantee absolute privacy. It is also increasingly recognized that research works better when scientists have more information about the people they study, and that being identifiable has some benefits (Angrist 2013). Neuroscience research offers examples of pioneering efforts. The fMRI Data Center is a leader in open-access data sharing in the functional neuroimaging community, overcoming logistical, cultural, and funding barriers (Mennes et al. 2013). Similarly, the INCF Task Force on Neuroimaging Datasharing has started work on tools to ease and automate the sharing of raw, processed, and derived neuroimaging data and metadata (Poline et al. 2012).
In the United States, legislation such as the Health Insurance Portability and Accountability Act (HIPAA) (Gostin 2001) and the Genetic Information Nondiscrimination Act has attempted to limit access to sensitive data and to limit discrimination related to health insurance and employment. However, it has been known for over a decade that seemingly anonymized data can be linked to publicly available information to identify specific individuals (Braun et al. 2009) using diagnosis codes (Tamersoy et al. 2010), rare visible disorders (Eguale et al. 2005), allele frequencies (Craig et al. 2011), place and date of birth (Acquisti and Gross 2009), a combination of a surname with age and state (Gymrek et al. 2013), and patient health location visit patterns (Malin 2007). Re-identification methods have included genotype-phenotype inferences, family structures, and dictionary attacks (Malin 2005). In total, …

[1]  Robyn Tamblyn,et al.  Rare Visible disorders/Diseases as Individually Identifiable Health Information , 2005, AMIA.

[2]  Misha Angrist.  Genetic privacy needs a more nuanced approach , 2013, Nature.

[3]  Ewout W Steyerberg,et al.  Outcome prediction after mild and complicated mild traumatic brain injury: external validation of existing models and identification of new predictors using the TRACK-TBI pilot study. , 2015, Journal of neurotrauma.

[4]  Don Barry Original by J. Stevens Measurement in Maintenance Management , 2016 .

[5]  Jinghui Zhang,et al.  Needles in the Haystack: Identifying Individuals Present in Pooled Genomic Data , 2009, PLoS genetics.

[6]  Clark C. Evans,et al.  Using global unique identifiers to link autism collections , 2010, J. Am. Medical Informatics Assoc..

[7]  Patricia Gillard,et al.  Perspectives of Australian adults about protecting the privacy of their health information in statistical databases , 2012, Int. J. Medical Informatics.

[8]  Bradley Malin,et al.  Evaluating re-identification risks with respect to the HIPAA privacy rule , 2010, J. Am. Medical Informatics Assoc..

[9]  Pratik Mukherjee,et al.  Magnetic resonance imaging improves 3‐month outcome prediction in mild traumatic brain injury , 2012, Annals of neurology.

[10]  Satrajit S. Ghosh,et al.  Data sharing in neuroimaging research , 2012, Front. Neuroinform..

[11]  Joshua C Denny,et al.  Anonymization of administrative billing codes with repeated diagnoses through censoring. , 2010, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[12]  Rita Noumeir,et al.  Pseudonymization of Radiology Data for Research Purposes , 2007, Journal of Digital Imaging.

[13]  Hester F. Lingsma,et al.  Transforming research and clinical knowledge in traumatic brain injury pilot: multicenter implementation of the common data elements for traumatic brain injury. , 2013, Journal of neurotrauma.

[14]  Bradley Malin,et al.  Re-identification of Familial Database Records , 2006, AMIA.

[15]  Bradley Malin,et al.  Never too old for anonymity: a statistical standard for demographic data sharing via the HIPAA Privacy Rule , 2011, J. Am. Medical Informatics Assoc..

[16]  S. Meystre,et al.  Automatic de-identification of textual documents in the electronic health record: a review of recent research , 2010, BMC medical research methodology.

[17]  John D. Van Horn,et al.  Domain-Specific Data Sharing in Neuroscience: What Do We Have to Learn from Each Other? , 2008, Neuroinformatics.

[18]  Daniel R Weinberger,et al.  Common genetic variation in Neuregulin 3 (NRG3) influences risk for schizophrenia and impacts NRG3 expression in human brain , 2010, Proceedings of the National Academy of Sciences.

[19]  Jens Meyer,et al.  Efficient data management in a large-scale epidemiology research project , 2012, Comput. Methods Programs Biomed..

[20]  Krzysztof J. Gorgolewski,et al.  Making Data Sharing Count: A Publication-Based Solution , 2012, Front. Neurosci..

[21]  Alessandro Acquisti,et al.  Predicting Social Security numbers from public data , 2009, Proceedings of the National Academy of Sciences.

[22]  Pratik Mukherjee,et al.  The impact of previous traumatic brain injury on health and functioning: a TRACK-TBI study. , 2013, Journal of neurotrauma.

[23]  Stephen T. Sherry,et al.  Assessing and managing risk when sharing aggregate genetic variant data , 2011, Nature Reviews Genetics.

[24]  Melissa C Brouwers,et al.  Written informed consent and selection bias in observational studies using medical records: systematic review , 2009, BMJ : British Medical Journal.

[25]  Michael Nolte,et al.  Solving Problems of Disclosure Risk in an Academic Setting: Using a Combination of Restricted Data and Restricted Access Methods , 2006, Journal of empirical research on human research ethics : JERHRE.

[26]  Bradley Malin,et al.  A computational model to protect patient data from location-based re-identification , 2007, Artif. Intell. Medicine.

[27]  Peter Kochunov,et al.  Genetic basis of neurocognitive decline and reduced white-matter integrity in normal human brain aging , 2013, Proceedings of the National Academy of Sciences.

[28]  R. Bharat Rao,et al.  Secure De-identification and Re-identification , 2003, AMIA.

[29]  K. Brazil,et al.  Access to medical records for research purposes: varying perceptions across research ethics boards , 2008, Journal of Medical Ethics.

[30]  J. Wardlaw,et al.  An open source toolkit for medical imaging de-identification , 2010, European Radiology.

[31]  N Cohen,et al.  Coding of DNA Samples and Data in the Pharmaceutical Industry: Current Practices and Future Directions—Perspective of the I‐PWG , 2011, Clinical pharmacology and therapeutics.

[32]  J. Marc Overhage,et al.  Application of Information Technology: A Context-sensitive Approach to Anonymizing Spatial Surveillance Data: Impact on Outbreak Detection , 2006, J. Am. Medical Informatics Assoc..

[33]  Clement J. McDonald,et al.  Application of Information Technology: A Software Tool for Removing Patient Identifying Information from Clinical Documents , 2008, J. Am. Medical Informatics Assoc..

[34]  Lynette Hirschman,et al.  The MITRE Identification Scrubber Toolkit: Design, training, and assessment , 2010, Int. J. Medical Informatics.

[35]  D. Roden,et al.  Development of a Large‐Scale De‐Identified DNA Biobank to Enable Personalized Medicine , 2008, Clinical pharmacology and therapeutics.

[36]  L. Gostin,et al.  National health information privacy: regulations under the Health Insurance Portability and Accountability Act. , 2001, JAMA.

[37]  Jianhua Liu,et al.  Toward a Fully De-identified Biomedical Information Warehouse , 2009, AMIA.

[38]  Julia Lane,et al.  Balancing access to health data and privacy: a review of the issues and approaches for the future. , 2010, Health services research.

[39]  Wayne Hall,et al.  Neuroscience research on the addictions: a prospectus for future ethical and policy analysis. , 2004, Addictive behaviors.

[40]  Clement J. McDonald,et al.  A successful technique for removing names in pathology reports using an augmented search and replace method , 2002, AMIA.

[41]  Chia-Hung Hsiao,et al.  Embedding a Hiding Function in a Portable Electronic Health Record for Privacy Preservation , 2008, Journal of Medical Systems.

[42]  Eran Halperin,et al.  Identifying Personal Genomes by Surname Inference , 2013, Science.

[43]  Yasuhiro Fujiwara,et al.  De-identification procedure and sample quality of the post-clinical test samples at the bio-repository of the National Cancer Center Hospital (NCCH) in Tokyo. , 2011, Japanese journal of clinical oncology.

[44]  Kristen Dams-O'Connor,et al.  The Impact of Previous Traumatic Brain Injury on Health and Functioning: A TRACK-TBI Study , 2013 .

[45]  A Berghold,et al.  The Genome Austria Tissue Bank (GATiB) , 2007, Pathobiology.

[46]  B. Malin,et al.  Anonymization of electronic medical records for validating genome-wide association studies , 2010, Proceedings of the National Academy of Sciences.

[47]  Bradley Malin,et al.  Technical Evaluation: An Evaluation of the Current State of Genomic Data Privacy Protection Technology and a Roadmap for the Future , 2004, J. Am. Medical Informatics Assoc..

[48]  Bharat B. Biswal,et al.  Making data sharing work: The FCP/INDI experience , 2013, NeuroImage.

[49]  Hester F. Lingsma,et al.  Acute biomarkers of traumatic brain injury: relationship between plasma levels of ubiquitin C-terminal hydrolase-L1 and glial fibrillary acidic protein. , 2014, Journal of neurotrauma.

[50]  Robert M. Goor,et al.  Assessing and managing risk when sharing aggregate genetic variant data , 2011, Nature Reviews Genetics.

[51]  Paul G Shekelle,et al.  Delivery of genomic medicine for common chronic adult diseases: a systematic review. , 2008, JAMA.