VoSeq: A Voucher and DNA Sequence Web Application

There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

[1]  George Smith Gibbes,et al.  On Life , 1828, The London medical and physical journal.

[2]  C. Marshall Encyclopedia of Life , 2008 .

[3]  Roderic D. M. Page,et al.  Biodiversity informatics: the challenge of linking data and the role of shared identifiers , 2008, Briefings Bioinform..

[4]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[5]  Edward Baker,et al.  Scratchpads 2.0: a Virtual Research Environment supporting scholarly collaboration, communication and data publication in biodiversity science , 2011, ZooKeys.

[6]  Pablo A. Goloboff,et al.  TNT, a free program for phylogenetic analysis , 2008 .

[7]  Roderic D. M. Page,et al.  Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library , 2011, BMC Bioinformatics.

[8]  J. Rougemont,et al.  A rapid bootstrap algorithm for the RAxML Web servers. , 2008, Systematic biology.

[9]  D. Maddison,et al.  NEXUS: an extensible file format for systematic information. , 1997, Systematic biology.

[10]  David L. Wheeler,et al.  GenBank , 2015, Nucleic Acids Res..

[11]  P. Hebert,et al.  bold: The Barcode of Life Data System (http://www.barcodinglife.org) , 2007, Molecular ecology notes.

[12]  John P. Huelsenbeck,et al.  MrBayes 3: Bayesian phylogenetic inference under mixed models , 2003, Bioinform..

[13]  Nico Cellinese,et al.  Evolutionary informatics: unifying knowledge about the diversity of life. , 2012, Trends in ecology & evolution.

[14]  P. Hebert,et al.  bold: The Barcode of Life Data System (http://www.barcodinglife.org) , 2007, Molecular ecology notes.

[15]  M. Lane The Global Biodiversity Information Facility , 2005 .