SDT: A Virus Classification Tool Based on Pairwise Sequence Alignment and Identity Calculation

The steadily increasing rate at which full viral genome sequences are being determined is creating a pressing demand for computational tools that aid the objective classification of these sequences. Taxonomic classification approaches based on pairwise genetic identity measures are potentially highly automatable and are progressively gaining favour with the International Committee on Taxonomy of Viruses (ICTV). There are, however, two issues with the calculation of such measures that could undermine the accuracy and consistency with which they are applied to virus classification. First, pairwise sequence identities computed from a multiple sequence alignment, rather than from multiple independent pairwise alignments, can be deflated as dataset size increases. Second, when gap characters must be introduced during sequence alignment to account for insertions and deletions, methodological variations in how these characters are introduced and handled during pairwise identity calculations can cause different methods to classify the same set of sequences inconsistently. Here we present the Sequence Demarcation Tool (SDT), a free, user-friendly computer program that aims to provide a robust and highly reproducible means of objectively using pairwise genetic identity calculations to classify any set of nucleotide or amino acid sequences. SDT can produce publication-quality pairwise identity plots and colour-coded distance matrices to further aid the classification of sequences according to ICTV-approved taxonomic demarcation criteria. In addition to a graphical interface version of the program for Windows computers, command-line versions are available for a variety of operating systems, including a parallel version for cluster computing platforms.
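The effect of gap handling on identity scores can be illustrated with a minimal sketch. The function below is hypothetical and is not taken from SDT; it assumes the two sequences have already been pairwise aligned and use `-` as the gap character. It computes identity under two common conventions: ignoring every column that contains a gap, or counting a column with a gap in one sequence as a mismatch.

```python
def pairwise_identity(aligned_a, aligned_b, count_gaps=False):
    """Fraction of identical positions between two pre-aligned sequences.

    Illustrative only (not SDT's code). With count_gaps=False, columns
    containing a gap are excluded from the comparison; with
    count_gaps=True, a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        if x == "-" or y == "-":
            if count_gaps:
                compared += 1  # gap column counted as a mismatch
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return matches / compared


# Made-up example alignment: 8 matches, 1 mismatch, 2 gap columns.
seq_a = "ATGC-ATTGCA"
seq_b = "ATGCGAT-GCT"

identity_ignoring_gaps = pairwise_identity(seq_a, seq_b)
identity_counting_gaps = pairwise_identity(seq_a, seq_b, count_gaps=True)
print(round(identity_ignoring_gaps, 3), round(identity_counting_gaps, 3))
```

For this alignment the two conventions give 8/9 and 8/11 respectively, so the same pair of sequences could fall on either side of a demarcation threshold placed between those values, depending only on how gaps are treated.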
