Crowdsourced direct-to-consumer genomic analysis of a family quartet

Background: We describe the pioneering experience of a Spanish family pursuing the goal of understanding their own personal genetic data to the fullest possible extent using direct-to-consumer (DTC) tests. With full informed consent from the Corpas family, all genotype, exome and metagenome data from members of this family are publicly available under a Creative Commons 0 (CC0) public domain waiver. All scientists or companies analysing these data ("the Corpasome") were invited to return results to the family.

Methods: We released 5 genotypes, 4 exomes and 1 metagenome from the Corpas family via a blog and figshare under a public domain license, inviting scientists to join the crowdsourcing effort to analyse the genomes in return for coauthorship or acknowledgement in derived papers. The resulting analysis data were compiled via social media and direct email.

Results: Here we present the results of our investigations, combining the crowdsourced contributions and our own efforts. Variant annotation services from four companies (BIOBASE, Ingenuity, Diploid and GeneTalk) were applied to the four family exomes. Starting from a common VCF file, and after selecting the significant results from each company's report, we find no overlap among the described annotations. We additionally report on a gut microbiome analysis of one member of the Corpas family.

Conclusions: This study presents an analysis of a diverse set of tools and methods offered by four DTC companies. The striking discordance of the results mirrors previous findings with respect to DTC analysis of SNP chip data, and highlights the difficulties of using DTC data for preventive medical care. To our knowledge, the data and analysis results from our crowdsourced study represent the most comprehensive exome and analysis for a family quartet using solely DTC data generation to date.
