BGI-RIS: an integrated information resource and comparative analysis workbench for rice genomics

Rice is a major food staple for the world's population and serves as a model species in cereal genome research. The Beijing Genomics Institute (BGI) has long been devoting itself to sequencing, information analysis and biological research of the rice and other crop genomes. In order to facilitate the application of the rice genomic information and to provide a foundation for functional and evolutionary studies of other important cereal crops, we implemented our Rice Information System (BGI-RIS), the most up-to-date integrated information resource as well as a workbench for comparative genomic analysis. In addition to comprehensive data from Oryza sativa L. ssp. indica sequenced by BGI, BGI-RIS also hosts carefully curated genome information from Oryza sativa L. ssp. japonica and EST sequences available from other cereal crops. In this resource, sequence contigs of indica (93-11) have been further assembled into Mbp-sized scaffolds and anchored onto the rice chromosomes referenced to physical/genetic markers, cDNAs and BAC-end sequences. We have annotated the rice genomes for gene content, repetitive elements, gene duplications (tandem and segmental) and single nucleotide polymorphisms between rice subspecies. Designed as a basic platform, BGI-RIS presents the sequenced genomes and related information in systematic and graphical ways for the convenience of in-depth comparative studies (http://rise.genomics.org.cn/).

[1]  Ronald L. Phillips,et al.  Relationships of cereal crops and other grasses , 1998 .

[2]  J. Bennetzen,et al.  Microcolinearity in sh2-homologous regions of the maize, rice, and sorghum genomes. , 1997, Proceedings of the National Academy of Sciences of the United States of America.

[3]  P. B. Coaker,et al.  Applied Dynamic Programming , 1964 .

[4]  J. Jurka,et al.  Repeats in genomic DNA: mining and meaning. , 1998, Current opinion in structural biology.

[5]  Michael Y. Galperin,et al.  The COG database: new developments in phylogenetic classification of proteins from complete genomes , 2001, Nucleic Acids Res..

[6]  Huanming Yang,et al.  A Draft Sequence of the Rice Genome (Oryza sativa L. ssp. japonica) , 2002, Science.

[7]  Sean R Eddy,et al.  What is dynamic programming? , 2004, Nature Biotechnology.

[8]  G. Murphy,et al.  The small, the large and the wild: the value of comparison in plant genomics. , 1999, Trends in genetics : TIG.

[9]  D. Haussler,et al.  A hidden Markov model that finds genes in E. coli DNA. , 1994, Nucleic acids research.

[10]  P. Green,et al.  Base-calling of automated sequencer traces using phred. I. Accuracy assessment. , 1998, Genome research.

[11]  J. Kawai,et al.  Collection, Mapping, and Annotation of Over 28,000 cDNA Clones from japonica Rice , 2003, Science.

[12]  R. Guigó,et al.  Evaluation of gene structure prediction programs. , 1996, Genomics.

[13]  S McCouch,et al.  Toward a plant genomics initiative: thoughts on the value of cross-species and cross-genera comparisons in the grasses. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[14]  D. Haussler,et al.  Hidden Markov models in computational biology. Applications to protein modeling. , 1993, Journal of molecular biology.

[15]  S. Karlin,et al.  Prediction of complete gene structures in human genomic DNA. , 1997, Journal of molecular biology.

[16]  J W Fickett,et al.  Finding genes by computer: the state of the art. , 1996, Trends in genetics : TIG.

[17]  Philip Lijnzaad,et al.  The Ensembl genome database project , 2002, Nucleic Acids Res..

[18]  P Green,et al.  Base-calling of automated sequencer traces using phred. II. Error probabilities. , 1998, Genome research.

[19]  Wei Zhao,et al.  Gramene: a resource for comparative grass genomics , 2002, Nucleic Acids Res..

[20]  A. Oliphant,et al.  A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). , 2002, Science.