Derivation of Transcriptional Regulatory Relationships by Partial Least Squares Regression

Because the number of genes in a transcriptional regulatory network is large and the number of samples in any single biological data type is usually small, reverse engineering these networks requires integrating multiple data types. In this paper, we propose a method that integrates microarray gene expression, ChIP-chip, and transcription factor binding motif data sets in a partial least squares regression model to derive transcription factor (TF)-gene interactions. The model considers both single and synergistic effects of TFs on the promoters. We also propose a method that dynamically updates the significance level based on the ChIP-chip and binding motif data. The results, evaluated by methods based on Gene Ontology, demonstrate the effectiveness of the proposed approach.
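The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a synergistic effect can be encoded as the product of two centered TF expression profiles, fits a one-component PLS1 model in plain Python, and uses a hypothetical rule that relaxes the significance level when ChIP-chip or motif evidence supports every TF in a term. All function names, the values of `base` and `relaxed`, and the toy data are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def center(column):
    """Subtract the mean from a list of numbers."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def build_predictors(tf_expr):
    """Return centered predictor columns for single TFs and TF pairs.

    tf_expr maps a TF name to its expression profile across samples.
    A pair's column is the product of the two centered profiles, which is
    one simple (assumed) way to encode a synergistic effect.
    """
    centered = {tf: center(profile) for tf, profile in tf_expr.items()}
    columns = {(tf,): col for tf, col in centered.items()}
    for a, b in combinations(sorted(centered), 2):
        product = [x * y for x, y in zip(centered[a], centered[b])]
        columns[(a, b)] = center(product)
    return columns


def pls1_one_component(columns, target):
    """Fit a one-component PLS1 model; return one coefficient per predictor."""
    y = center(target)
    # Weight vector: covariance direction between each predictor and y.
    w = {k: sum(x * v for x, v in zip(col, y)) for k, col in columns.items()}
    norm = sqrt(sum(v * v for v in w.values()))
    if norm == 0:
        return {k: 0.0 for k in columns}
    w = {k: v / norm for k, v in w.items()}
    # Latent score t = X w, then regress y on t.
    n = len(y)
    t = [sum(w[k] * columns[k][i] for k in columns) for i in range(n)]
    q = sum(a * b for a, b in zip(t, y)) / sum(a * a for a in t)
    return {k: w[k] * q for k in columns}


def significance_level(tfs, gene, chip_hits, motif_hits, base=0.01, relaxed=0.05):
    """Hypothetical rule: relax alpha when every TF in the term has binding evidence."""
    supported = all(
        (tf, gene) in chip_hits or (tf, gene) in motif_hits for tf in tfs
    )
    return relaxed if supported else base


# Toy data (made up): two TF profiles and one target gene over four samples.
tf_expr = {"TF1": [1, 2, 3, 4], "TF2": [2, 1, 4, 3]}
target = [2, 4, 6, 8]
coefficients = pls1_one_component(build_predictors(tf_expr), target)
alpha = significance_level(("TF1",), "GENE_A", {("TF1", "GENE_A")}, set())
```

In a fuller analysis, each coefficient would be tested against a null distribution and compared with the evidence-dependent `alpha`; a single component is used here only to keep the sketch short.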
