G-quadruplex forming sequences in the genome of all known human viruses: A comprehensive guide

G-quadruplexes are non-canonical nucleic acid structures that regulate transcription, replication, and recombination. They occur in eukaryotes, prokaryotes, and viruses, and mounting evidence points to key biological roles in viruses. Because the data on viruses are scattered, we present a comprehensive analysis of potential quadruplex-forming sequences (PQSs) in the genomes of all known viruses that can infect humans. We show that the occurrence and location of PQSs are characteristic of each virus class and family. Our statistical analysis indicates that PQSs are arranged in an ordered rather than random way within viral genomes: up to two-thirds of viruses can be assigned to their exact class on the basis of PQS classification. For each virus we provide: (i) the list of all PQSs present in the genome (positive and negative strands), (ii) their position in the viral genome, (iii) the degree of conservation of each PQS, in its genome context, among strains, and (iv) the statistical significance of PQS abundance. This information is accessible through a database that allows easy navigation of the results: http://www.medcomp.medicina.unipd.it/main_site/doku.php?id=g4virus. The availability of these data will greatly expedite research on G-quadruplexes in viruses and may accelerate the discovery of therapeutic opportunities for numerous human diseases, some of them severe.
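
To make the per-virus listing concrete (PQSs on both strands, with their positions), the following is a minimal Python sketch of a PQS scan. It is not the pipeline used for the database. It assumes the commonly used canonical motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; the thresholds `MIN_RUN` and `MAX_LOOP`, the function names, and the output format are all illustrative choices, and the abstract does not state which pattern was actually applied.

```python
import re

# Assumed thresholds (not taken from the abstract): G-runs of >= 3 bases,
# loops of 1-7 bases, four G-runs per motif.
MIN_RUN = 3
MAX_LOOP = 7
PQS_PATTERN = re.compile(
    r"G{%d,}(?:[ACGT]{1,%d}G{%d,}){3}" % (MIN_RUN, MAX_LOOP, MIN_RUN)
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pqs(genome):
    """List non-overlapping PQS hits on both strands.

    Each hit is (strand, start, end, g_rich_sequence). Coordinates are
    0-based, half-open, and always refer to the positive strand.
    RNA input is handled by treating U as T.
    """
    plus = genome.upper().replace("U", "T")
    n = len(plus)
    hits = []

    # Positive strand: G-rich motifs read directly from the sequence.
    for m in PQS_PATTERN.finditer(plus):
        hits.append(("+", m.start(), m.end(), m.group()))

    # Negative strand: scan the reverse complement, then map the match
    # coordinates back onto the positive strand.
    minus = reverse_complement(plus)
    for m in PQS_PATTERN.finditer(minus):
        hits.append(("-", n - m.end(), n - m.start(), m.group()))

    return hits
```

Calling `find_pqs` on a genome string returns one tuple per hit, so a C-rich stretch on the positive strand is reported as a `"-"` hit carrying the G-rich sequence of the complementary strand. Because `finditer` yields non-overlapping matches, overlapping candidate motifs are merged or skipped; a fuller analysis would need an explicit policy for those.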
