diArk 2.0 provides detailed analyses of the ever increasing eukaryotic genome sequencing data

Background

With current next-generation sequencing methods, sequencing even the largest mammalian genomes has become a matter of days. It comes as no surprise that dozens of genome assemblies are now released per month. Since the number of next-generation sequencing machines is increasing worldwide and new major sequencing plans have been announced, genome assemblies are expected to be released at an even faster rate. It is therefore increasingly important to have both an overview of and detailed information about the available sequenced genomes. The different sequencing and assembly methods have specific characteristics that need to be known in order to evaluate the various genome assemblies before performing subsequent analyses.

Results

diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide. Currently, diArk 2.0 contains information about more than 880 species and more than 2350 genome assembly files. It provides many kinds of metadata, such as sequencing and read-assembly methods, sequencing coverage, GC-content, extended lists of alternatively used scientific names and common species names, and various kinds of statistics. To make the data intuitive to explore, the web interface makes extensive use of modern web techniques. A number of search modules and result views help users find and judge the data of interest. Subscribing to the RSS feed is the easiest way to stay up-to-date with the latest genome data.

Conclusions

diArk 2.0 is the most up-to-date database of sequenced eukaryotic genomes compared to databases like GOLD, NCBI Genome, NHGRI, and ISC. It differs from these in that it stores only those projects for which genome assembly data or considerable amounts of cDNA data are available. Projects in the planning stage or in the process of being sequenced are not included. The user can easily search through the provided data and directly access the genome assembly files of the sequenced genome of interest.
diArk 2.0 is available at http://www.diark.org.
