Bioinformatics for analysis of poxvirus genomes.

In recent years, molecular biology has seen major technological advances, including DNA sequencing, mass spectrometry of proteins, and microarray analysis of mRNA transcripts. However, it is perhaps genomics, which has now generated the complete genome sequences of more than 100 poxviruses, that has had the greatest impact on the average virology researcher, because DNA sequence data are in constant use in many different ways by almost all molecular virologists. As this data resource grows, so does the importance of databases and software tools that enable the bench virologist to work with and make use of this valuable and expensive DNA sequence information. Providing researchers with intuitive software to, first, select and reformat genomics data from large databases; second, compare and analyze genomics data; and third, view and interpret large and complex sets of results has therefore become pivotal to progress in modern virology. This chapter is directed at the bench virologist and describes the software required for a number of common bioinformatics techniques that are useful for comparing and analyzing poxvirus genomes. In a number of examples, we also highlight the Viral Orthologous Clusters database system and integrated tools that we developed for the management and analysis of complete viral genomes.
