The Ensembl Variant Effect Predictor

The Ensembl Variant Effect Predictor (VEP) is a toolset for the analysis, annotation and prioritization of genomic variants in both coding and non-coding regions. The VEP accurately predicts the effects of sequence variants on transcripts, protein products, regulatory regions and binding motifs by drawing on the high quality, broad scope and integrated nature of the Ensembl databases. It also compares input variants against the large collection of publicly available variation data held in Ensembl, giving insight into population and ancestral genetics, phenotypes and disease. The VEP is open source and free to use. It is available through a web interface (http://www.ensembl.org/vep), a downloadable package, and Ensembl's Perl and REST application programming interfaces (APIs).
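As an illustration of the REST access route mentioned above, the following is a minimal sketch of querying the VEP for a known variant identifier, not a definitive client. It assumes the public server at https://rest.ensembl.org and its `GET /vep/:species/id/:id` endpoint, and it assumes that each record in the JSON response carries `input` and `most_severe_consequence` fields. The helper names (`vep_id_url`, `fetch_vep`, `summarize`) and the example identifier are chosen here for illustration and do not come from the text.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server; not stated in the abstract.
REST_SERVER = "https://rest.ensembl.org"


def vep_id_url(variant_id, species="human"):
    """Build the URL for the assumed endpoint GET /vep/:species/id/:id."""
    return "{}/vep/{}/id/{}".format(
        REST_SERVER,
        urllib.parse.quote(species, safe=""),
        urllib.parse.quote(variant_id, safe=""),
    )


def fetch_vep(variant_id, species="human"):
    """Request VEP annotation for one variant ID and decode the JSON reply."""
    request = urllib.request.Request(
        vep_id_url(variant_id, species),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize(records):
    """Return (input, most_severe_consequence) pairs from a decoded reply.

    Both field names are assumptions about the response layout; missing
    fields come back as None instead of raising.
    """
    return [(r.get("input"), r.get("most_severe_consequence"))
            for r in records]


if __name__ == "__main__":
    # Example identifier only; any dbSNP rsID known to Ensembl should work.
    for variant, consequence in summarize(fetch_vep("rs56116432")):
        print(variant, consequence)
```

Keeping URL construction and response parsing separate from the network call means both can be checked offline. For large variant sets, the downloadable package is likely a better fit than per-variant HTTP requests.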
