DDBJ launches a new archive database with analytical tools for next-generation sequence data

The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) collected and released 1 701 110 entries/1 116 138 614 bases between July 2008 and June 2009. Highlighted data releases from DDBJ included the complete genome sequence of an endosymbiont within protist cells in the termite gut and Cap Analysis Gene Expression tags for human and mouse deposited by the Functional Annotation of the Mammalian cDNA consortium. In this period, we started a new user announcement service that uses Really Simple Syndication (RSS) to deliver a daily list of data released from DDBJ. We also attempted a comprehensive visualization of DDBJ release data using a word cloud program. Moreover, a new archive for data from next-generation sequencers, the ‘DDBJ Read Archive’ (DRA), was launched. Concurrently, a semi-automatic annotation tool for read data registered in DRA, the ‘DDBJ Read Annotation Pipeline’, was released as a preliminary step. The pipeline consists of two parts: basic analysis, comprising reference genome mapping and de novo assembly, and high-level analysis, comprising structural and functional annotation. These new services will aid users’ research and provide easier access to DDBJ databases.
