An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS) statistically connect genotypes to phenotypes without recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs), protein-protein interactions (PPIs) and molecular pathways. Integrating the different levels of regulatory information of an organism is expected to provide an effective way of mapping genotypes to phenotypes. However, the lack of a curated metabolic model of rice has hindered genome-scale multi-level network reconstruction. Here, we merge GRN, PPI and genome-scale metabolic network (GSMN) approaches into a single framework for rice by reconstructing and integrating regulatory information from omics data. First, we reconstructed a genome-scale metabolic model containing 4,462 functional genes and 2,986 metabolites involved in 3,316 reactions, compartmentalized into ten subcellular locations. We then integrated 90,358 protein-protein interaction pairs, 662,936 gene regulation pairs and 1,763 microRNA-target interactions into the metabolic model. Finally, a database was developed for systematically storing and retrieving the genome-scale multi-level network of rice. This work provides a reference for understanding the genotype-phenotype relationship in rice and for analyzing its molecular regulatory network.
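To make the idea of a multi-level network concrete, the following is a minimal sketch of how gene-reaction associations, PPIs, gene regulations and microRNA-target interactions might be held in one layered structure and queried per gene. It is an illustration only and not the implementation or database schema described above: the class `MultiLevelNetwork`, the layer names and all identifiers (`geneA`, `tf1`, `mir1`, `R1`) are hypothetical.

```python
from collections import defaultdict


class MultiLevelNetwork:
    """Toy layered graph: each layer keeps its own adjacency sets."""

    def __init__(self):
        # layer name -> source node -> set of target nodes
        self._edges = defaultdict(lambda: defaultdict(set))

    def add_edge(self, layer, source, target, directed=True):
        """Store an edge in one layer; undirected edges are stored both ways."""
        self._edges[layer][source].add(target)
        if not directed:
            self._edges[layer][target].add(source)

    def outgoing(self, node):
        """Per-layer targets reachable from `node` in one step."""
        return {
            layer: sorted(adj[node])
            for layer, adj in self._edges.items()
            if node in adj
        }

    def incoming(self, node):
        """Per-layer sources that point at `node`."""
        result = {}
        for layer, adj in self._edges.items():
            sources = sorted(s for s, targets in adj.items() if node in targets)
            if sources:
                result[layer] = sources
        return result


# Hypothetical identifiers, used only to exercise the structure.
net = MultiLevelNetwork()
net.add_edge("metabolic", "geneA", "R1")               # gene catalyses reaction R1
net.add_edge("ppi", "geneA", "geneB", directed=False)  # protein-protein interaction
net.add_edge("grn", "tf1", "geneA")                    # transcription factor regulates geneA
net.add_edge("mirna", "mir1", "geneA")                 # microRNA targets geneA

print(net.outgoing("geneA"))
print(net.incoming("geneA"))
```

Keeping each level in its own layer means a single gene can be looked up across the metabolic, interaction and regulatory levels at once, while each level can still be analysed on its own.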
