Bottom-up GGM algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways or processes

Background: Multilayered hierarchical gene regulatory networks (ML-hGRNs) are important for understanding the genetic regulation of biological pathways. However, no computational algorithms are currently available for directly building the ML-hGRNs that regulate biological pathways.

Results: A bottom-up graphical Gaussian model (GGM) algorithm was developed for constructing an ML-hGRN operating above a biological pathway from small- to medium-sized microarray or RNA-seq data sets. The algorithm first places the genes of a pathway at the bottom layer and begins constructing the ML-hGRN by evaluating all gene triples, each consisting of two pathway genes and one regulatory gene. It retains every triple in which the regulatory gene significantly interferes with the pair of pathway genes. The regulatory genes with the highest interference frequency are kept as the second layer, and the number kept is determined by an optimization function. The algorithm is then applied recursively to build the ML-hGRN layer by layer until the defined number of layers is reached or the algorithm terminates automatically.

Conclusions: We validated the algorithm and demonstrated its high efficiency in constructing ML-hGRNs governing biological pathways. The algorithm is instrumental for biologists who want to learn the hierarchical regulators associated with a given biological pathway, even from small-sized microarray or RNA-seq data sets.
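The triple-evaluation step for a single layer can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "interference" is measured as the drop from the Pearson correlation of two pathway genes to their first-order partial correlation given the regulator, that a fixed cutoff (`min_drop`, hypothetical) stands in for the significance test, and that a fixed `top_k` (hypothetical) stands in for the optimization function. All function names are illustrative.

```python
import itertools
import numpy as np


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    denom = np.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    # Guard against division by zero when x or y is perfectly correlated with z.
    return (r_xy - r_xz * r_yz) / max(denom, 1e-12)


def interference_counts(expr, pathway, regulators, min_drop=0.3):
    """Count, per regulator, the pathway-gene pairs it interferes with.

    expr maps gene name -> 1D array of expression values (same samples).
    min_drop is an assumed cutoff replacing a formal significance test.
    """
    genes = list(pathway) + list(regulators)
    corr = np.corrcoef(np.array([expr[g] for g in genes]))
    idx = {g: i for i, g in enumerate(genes)}
    counts = {z: 0 for z in regulators}
    for x, y in itertools.combinations(pathway, 2):
        r_xy = corr[idx[x], idx[y]]
        for z in regulators:
            p = partial_corr(r_xy, corr[idx[x], idx[z]], corr[idx[y], idx[z]])
            # The regulator "interferes" if conditioning on it weakens the x-y link.
            if abs(r_xy) - abs(p) >= min_drop:
                counts[z] += 1
    return counts


def next_layer(counts, top_k):
    """Keep the top_k regulators by interference frequency (assumed rule)."""
    ranked = sorted(counts, key=lambda z: counts[z], reverse=True)
    return [z for z in ranked[:top_k] if counts[z] > 0]
```

To build further layers under the same assumptions, the genes returned by `next_layer` would be passed back in as the new `pathway` argument, with the remaining regulators as candidates, until no regulator is retained or the desired number of layers is reached.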
