The International Nucleotide Sequence Database Collaboration

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) captures, preserves and presents globally comprehensive public domain nucleotide sequence data. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes no longer comes as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science in recent years has brought a step change in growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.
