Accurately annotate compound effects of genetic variants using a context-sensitive framework

Abstract: In genomics, effectively identifying the biological effects of genetic variants is crucial. Current methods handle each variant independently, assuming that it acts in a context-free manner. However, variants within the same gene may interfere with each other, producing combinatorial (compound) rather than individual effects. In this work, we introduce COPE, a gene-centric variant annotation tool that integrates the entire sequence context when evaluating the functional effects of intragenic variants. Applying COPE to the 1000 Genomes dataset, we identified numerous cases of multiple-variant compound effects that frequently led to false-positive and false-negative loss-of-function calls by conventional variant-centric tools. Specifically, 64 disease-causing mutations were found to be rescued in a specific genomic context, thus potentially contributing to the buffering effects for highly penetrant deleterious mutations. COPE is freely available for academic use at http://cope.cbi.pku.edu.cn.
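The compound-effect idea can be illustrated with a toy example. The sketch below is not COPE's implementation; the coding sequence, the two variants, and all function names are hypothetical and chosen only to show how two indels that each look like a frameshift in isolation can restore the reading frame when they occur on the same haplotype.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_variants(cds, variants):
    """Apply (pos, ref, alt) variants, 0-based, to one haplotype.

    Variants are applied right to left so earlier positions stay valid.
    """
    seq = cds
    for pos, ref, alt in sorted(variants, reverse=True):
        assert seq[pos:pos + len(ref)] == ref, "reference allele mismatch"
        seq = seq[:pos] + alt + seq[pos + len(ref):]
    return seq


def is_frameshift(ref, alt):
    """Variant-centric view: any indel whose length is not a multiple of 3."""
    return (len(alt) - len(ref)) % 3 != 0


# Hypothetical toy gene and variants (assumptions for illustration only).
REF_CDS = "ATGGCTGAACTGAAAGGTTGGTAA"
VARIANTS = [
    (3, "GC", "G"),   # 1-bp deletion
    (9, "C", "CA"),   # 1-bp insertion a few bases downstream
]

ref_protein = translate(REF_CDS)
alone = [translate(apply_variants(REF_CDS, [v])) for v in VARIANTS]
together = translate(apply_variants(REF_CDS, VARIANTS))

print("reference :", ref_protein)
print("each alone:", alone)
print("together  :", together)
```

Evaluated one at a time, both variants are flagged as frameshifts. Evaluated together, the deletion and insertion cancel out, so the protein keeps its original length and only the residues between the two indels change. A variant-centric loss-of-function call would therefore be a false positive in this toy case.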
