Clinical Annotation Reference Templates: a resource for consistent variant annotation

Annotating the impact of a variant on a gene is a vital component of genetic medicine and genetic research. Different gene annotations for the same genomic variant are possible because different structures and sequences for the same gene are available. The clinical community typically uses RefSeq transcripts (NMs) to annotate gene variation, but these do not always match the reference genome. The scientific community typically uses Ensembl transcripts (ENSTs), which match the reference genome but often do not match the equivalent NM. Often the transcripts used to annotate gene variation are not reported, impeding interoperability and consistency. Here we introduce the concept of the Clinical Annotation Reference Template (CART). CARTs are analogous to the reference genome: they provide a universal standard template so that reference genomic coordinates are consistently annotated at the protein level. Naturally, there are many situations where annotations using a specific transcript, or multiple transcripts, are useful. The aim of the CARTs is not to impede this practice. Rather, the CART annotation serves as an anchor to ensure interoperability between different annotation systems and accurate variant frequencies. Annotations using other explicitly named transcripts should also be provided wherever useful. We have integrated transcript data to generate CARTs for over 18,000 genes, for both GRCh37 and GRCh38, based on the associated NM and ENST identified through the CART selection process. Each CART has a unique ID and can be used individually or as a stable set of templates: CART37A for GRCh37 and CART38A for GRCh38. We have made the CARTs available on the UCSC Genome Browser and in different file formats on the Open Science Framework: https://osf.io/tcvbq/. We have also made CARTtools, the software we used to generate the CARTs, available on GitHub. We hope the CARTs will be useful in helping to drive transparent, stable, consistent, interoperable variant annotation.
