GSA and BIGD: Filling the Gap of Bioinformatics Resource and Service in China*

In the first issue of 2017 of this journal, Genomics, Proteomics & Bioinformatics, a special database article entitled “GSA: Genome Sequence Archive” [1] was published. The article briefly introduces the platform developed by authors from the BIG Data Center (BIGD) of the Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS). The aim of the GSA project is to collect, integrate, and archive raw sequence data submitted by domestic and international users. It is one of the major activities carried out by a team of around 50 young bioinformaticians at BIGD. In addition to the GSA system, the team is working on several service-oriented bioinformatics projects, as described in one of their recent publications [2].

The past half century has witnessed great advances in molecular biology. The deciphering of the genetic code and the establishment of the central dogma, following the discovery of the DNA double helix, formed a solid theoretical basis for the life sciences. Meanwhile, the influential work of Frederick Sanger and others in determining peptide, tRNA, and DNA sequences, together with the fundamental efforts of John Kendrew and Max Perutz to solve the three-dimensional structures of proteins, marked the beginning of the accumulation of molecular biological data. Protein sequence databases