Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data

Background: High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique for studying the transcriptome of a cell under various conditions at a systems level. Although RNA-Seq has been widely used to generate experimental data in the last few years, few computational methods are available to analyze this large volume of transcription data. In particular, computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are lacking and urgently needed.

Results: We developed an automated bioinformatics method that predicts gene regulatory networks from the quantitative expression values of differentially expressed genes, based on RNA-Seq transcriptome data of a cell in different stages and conditions and integrating transcriptional, genomic and gene function data. We applied the method to RNA-Seq transcriptome data generated for soybean root hair cells in three different developmental stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common to all three stages, and 24, 49 and 70 modules specific to the first, second and third stage, respectively. Each module contains a group of co-expressed genes and several transcription factors that collaboratively control their expression under different conditions. Eight of the 10 common regulatory modules were validated by at least two kinds of evidence, such as independent DNA binding motif analysis, gene function enrichment tests, and previous experimental data in the literature.

Conclusions: We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and for designing biological experiments such as ChIP-Seq, RNA interference, and yeast two-hybrid experiments.
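The idea of a regulatory module, a group of co-expressed genes paired with candidate transcription factors, can be illustrated with a small Python sketch. This is not the method developed in the paper, whose details the abstract does not give; it substitutes a simple greedy grouping by Pearson correlation and ranks regulators by their correlation with the module's mean profile. All gene names, transcription factor names, expression values and the 0.9 threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def group_coexpressed(profiles, threshold=0.9):
    """Greedy grouping: a gene joins the first module whose seed gene
    it correlates with at or above the threshold, else starts a new one."""
    modules = []
    for gene, profile in profiles.items():
        for module in modules:
            if pearson(profile, profiles[module[0]]) >= threshold:
                module.append(gene)
                break
        else:
            modules.append([gene])
    return modules


def rank_regulators(module, profiles, tf_profiles):
    """Rank candidate transcription factors by the absolute correlation
    between their profile and the module's mean expression profile."""
    columns = zip(*(profiles[g] for g in module))
    mean = [sum(col) / len(module) for col in columns]
    scores = {tf: pearson(p, mean) for tf, p in tf_profiles.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: expression of four genes and two
# transcription factors across four samples.
genes = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.0],
}
tfs = {
    "tf1": [0.5, 1.0, 1.5, 2.0],
    "tf2": [1.0, 3.0, 1.0, 3.0],
}

modules = group_coexpressed(genes)
ranked = rank_regulators(modules[0], genes, tfs)
```

In this toy input, genes whose profiles rise together fall into one module and genes whose profiles fall together into another, and the transcription factor that tracks the first module's mean profile most closely is ranked first. A realistic analysis would start from differentially expressed genes and add further evidence, such as binding motifs and gene function annotations, as the abstract describes.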
