Responsible, practical genomic data sharing that accelerates research
