Inference of transcriptional regulatory network by two-stage constrained space factor analysis

MOTIVATION Microarray gene expression and cross-linking chromatin immunoprecipitation data contain voluminous information that can help identify transcriptional regulatory networks at the full-genome scale. Such high-throughput data are noisy, however. In contrast, the biomedical literature contains many evidenced transcription factor (TF)-target gene binding relationships that have been elucidated at the molecular level. But such sporadically generated knowledge offers only glimpses of limited patches of the network. How to incorporate this valuable knowledge resource to build more reliable network models remains an open question.

RESULTS We present a modified factor analysis approach. Our algorithm starts with the evidenced TF-gene linkages. Using the high-throughput data, it then iterates between a network configuration estimation step and a connection strength estimation step until convergence. We report two comprehensive regulatory networks obtained for Saccharomyces cerevisiae, one under the normal growth condition and the other under the environmental stress condition.

SUPPLEMENTARY INFORMATION http://kiefer.stat.ucla.edu/lap2/download/bti656_supplement.pdf.
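To make the two-step iteration concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a linear factor model E ≈ A P, where E is a genes-by-samples expression matrix, A holds connection strengths and is constrained to be zero wherever a boolean mask Z has no TF-gene link, and P holds TF activities. The least-squares updates, the thresholding rule for adding links, and every name used here (`fit_strengths`, `update_configuration`, `two_stage_fit`, `threshold`) are illustrative assumptions; the abstract does not specify how either step is carried out.

```python
import numpy as np

def fit_strengths(E, Z, n_als=20):
    """Connection-strength step (assumed form): alternating least squares
    for E ~ A @ P with A forced to zero wherever the mask Z is False."""
    A = Z.astype(float)  # start with unit strength on every allowed link
    for _ in range(n_als):
        # TF activities given the current strengths
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # strengths given activities, one gene at a time, allowed links only
        for i in range(E.shape[0]):
            idx = np.flatnonzero(Z[i])
            A[i] = 0.0
            if idx.size:
                A[i, idx] = np.linalg.lstsq(P[idx].T, E[i], rcond=None)[0]
    return A, P

def update_configuration(E, P, Z_prior, threshold):
    """Configuration step (assumed form): regress every gene on all TF
    activities and keep links with a large coefficient. Links in Z_prior
    (the literature-evidenced ones) are always retained."""
    coef = np.linalg.lstsq(P.T, E.T, rcond=None)[0].T  # genes x TFs
    return Z_prior | (np.abs(coef) > threshold)

def two_stage_fit(E, Z_prior, threshold=0.5, max_iter=50):
    """Alternate the two steps until the link set stops changing."""
    Z = Z_prior.copy()
    for _ in range(max_iter):
        A, P = fit_strengths(E, Z)
        Z_new = update_configuration(E, P, Z_prior, threshold)
        if np.array_equal(Z_new, Z):
            break
        Z = Z_new
    else:
        # iteration budget exhausted: refit so A matches the returned mask
        A, P = fit_strengths(E, Z)
    return Z, A, P

# Synthetic demonstration data (entirely made up for illustration).
rng = np.random.default_rng(0)
n_genes, n_tfs, n_samples = 30, 3, 20
Z_true = rng.random((n_genes, n_tfs)) < 0.4
A_true = Z_true * rng.normal(0.0, 2.0, size=(n_genes, n_tfs))
P_true = rng.normal(size=(n_tfs, n_samples))
E = A_true @ P_true + 0.1 * rng.normal(size=(n_genes, n_samples))

# Pretend only about half of the true links are documented in the literature.
Z_prior = Z_true & (rng.random((n_genes, n_tfs)) < 0.5)
Z, A, P = two_stage_fit(E, Z_prior)
```

In this sketch the evidenced linkages play two roles: they initialize the mask, and they are never dropped by the configuration step. Whether the published method treats prior links as fixed or allows them to be revised is not stated in the abstract, so that choice is an assumption as well.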

[1]  G. Church,et al.  Computational identification of transcription factor binding sites via a transcription-factor-centric clustering (TFCC) algorithm. , 2002, Journal of molecular biology.

[2]  Nicola J. Rinaldi,et al.  Computational discovery of gene modules and regulatory networks , 2003, Nature Biotechnology.

[3]  D. Pe’er,et al.  Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data , 2003, Nature Genetics.

[4]  Patrik D'haeseleer,et al.  Genetic network inference: from co-expression clustering to reverse engineering , 2000, Bioinform..

[5]  Chiara Sabatti,et al.  Network component analysis: Reconstruction of regulatory signals in biological systems , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[6]  Nicola J. Rinaldi,et al.  Transcriptional Regulatory Networks in Saccharomyces cerevisiae , 2002, Science.

[7]  Kara Dolinski,et al.  Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO) , 2002, Nucleic Acids Res..

[8]  D. Botstein,et al.  Genomic expression programs in the response of yeast cells to environmental changes. , 2000, Molecular biology of the cell.

[9]  Xin Chen,et al.  The TRANSFAC system on gene expression regulation , 2001, Nucleic Acids Res..

[10]  Michael A. Beer,et al.  Whole-genome discovery of transcription factor binding sites by network-level conservation. , 2003, Genome research.

[11]  D. Botstein,et al.  Cluster analysis and display of genome-wide expression patterns. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[12]  Ker-Chau Li,et al.  A system for enhancing genome-wide coexpression dynamics study. , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[13]  D. F. Morrison,et al.  Multivariate Statistical Methods , 1968 .

[14]  Nir Friedman,et al.  Inferring subnetworks from perturbed expression profiles , 2001, ISMB.

[15]  Ker-Chau Li,et al.  Genome-wide coexpression dynamics: Theory and application , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[16]  Yaniv Ziv,et al.  Revealing modular organization in the yeast transcriptional network , 2002, Nature Genetics.

[17]  Nicola J. Rinaldi,et al.  Transcriptional regulatory code of a eukaryotic genome , 2004, Nature.

[18]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[19]  I. Jolliffe,et al.  Nonlinear Multivariate Analysis , 1992 .

[20]  Holger H. Hoos,et al.  Inference of transcriptional regulation relationships from gene expression data , 2003, SAC '03.

[21]  Mark Gerstein,et al.  Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data , 2003, Bioinform..

[22]  Xiaojiang Xu,et al.  Learning module networks from genome‐wide location and expression data , 2004, FEBS letters.

[23]  Holger H. Hoos,et al.  Inference of Transcriptional Regulation Relationships from Gene Expression Data , 2003, Bioinform..

[24]  Scott Chapman,et al.  Using biplots to interpret gene expression patterns in plants , 2002, Bioinform..

[25]  Sven Bergmann,et al.  Defining transcription modules using large-scale gene expression data , 2004, Bioinform..

[26]  Lorenz Wernisch,et al.  Reconstruction of gene networks using Bayesian learning and manipulation experiments , 2004, Bioinform..

[27]  S. Rafii,et al.  Splitting vessels: Keeping lymph apart from blood , 2003, Nature Medicine.

[28]  A. Kimura,et al.  Chromosomal gradient of histone acetylation established by Sas2p and Sir2p functions as a shield against gene silencing , 2002, Nature Genetics.

[29]  W. Wong,et al.  Functional annotation and network reconstruction through cross-platform integration of microarray data , 2005, Nature Biotechnology.

[30]  Concepcion R. Nierras,et al.  Transcriptional Elements Involved in the Repression of Ribosomal Protein Synthesis , 1999, Molecular and Cellular Biology.

[31]  Adam Godzik,et al.  Comparative analysis of protein domain organization. , 2004, Genome research.

[32]  Michael Ruogu Zhang,et al.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. , 1998, Molecular biology of the cell.

[33]  P. Qiu Recent advances in computational promoter analysis in understanding the transcriptional regulatory network. , 2003, Biochemical and biophysical research communications.

[34]  Hiroyuki Toh,et al.  Inference of a genetic network by a combined approach of cluster analysis and graphical Gaussian modeling , 2002, Bioinform..

[35]  M. Hill,et al.  NONLINEAR MULTIVARIATE ANALYSIS , 1990 .

[36]  Jaak Vilo,et al.  Building and analysing genome-wide gene disruption networks , 2002, ECCB.