Finding a Needle in a Haystack: Variant Effect Predictor (VEP) Prioritizes Disease Causative Variants from Millions of Neutral Ones

The Ensembl Variant Effect Predictor (VEP) is an integrative computational platform that provides analysis, genomic annotation, and pathogenicity predictions for genetic sequence variants lying in both protein-coding and noncoding regions of the human genome. The web server acts as a gateway to a diverse range of genomic annotations and as a one-step platform for entering mutation data and analyzing prediction outcomes in different formats. It is open access, easy to use, and provides reproducible results. VEP simplifies variant analysis and interpretation across diverse study settings of the human genome. This chapter describes basic navigation for VEP users and illustrates how to use the web-based interface to analyze single-nucleotide variants (SNVs). This includes (i) data input, (ii) pathogenicity predictions, (iii) preview of results, and (iv) downloading the results.
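
As an illustration of the data input step (i), the following minimal sketch formats SNVs as lines in the Ensembl default whitespace-separated input format (chromosome, start, end, allele as REF/ALT, strand), which can be pasted into the web interface. The `SNV` type and the `to_vep_default_line` helper are hypothetical names introduced here for illustration and are not part of VEP; the example coordinates are placeholders, not variants discussed in the chapter.

```python
from typing import NamedTuple


class SNV(NamedTuple):
    """A single-nucleotide variant (hypothetical helper type, not a VEP API)."""
    chrom: str
    pos: int   # 1-based position on the reference genome
    ref: str   # reference allele
    alt: str   # alternate allele


def to_vep_default_line(snv: SNV, strand: str = "+") -> str:
    """Format one SNV as a line of Ensembl default input.

    For an SNV, start and end are the same coordinate.
    """
    if len(snv.ref) != 1 or len(snv.alt) != 1:
        raise ValueError("this sketch handles single-nucleotide variants only")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    return f"{snv.chrom} {snv.pos} {snv.pos} {snv.ref}/{snv.alt} {strand}"


if __name__ == "__main__":
    # Placeholder variants for illustration only.
    variants = [SNV("5", 140532, "T", "C"), SNV("12", 1017956, "T", "A")]
    for variant in variants:
        print(to_vep_default_line(variant))
```

Restricting the helper to one-base alleles keeps the sketch within the SNV scope of the chapter; insertions and deletions use different start and end conventions and are deliberately not handled.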
