A multi-layer inference approach to reconstruct condition-specific genes and their regulation

An important topic in systems biology is the reverse engineering of regulatory mechanisms through the reconstruction of context-dependent gene networks. A major challenge is to identify the genes and regulatory interactions specific to a condition or phenotype, because regulatory processes are highly connected and a specific response is typically accompanied by numerous collateral effects. In this study, we design a multi-layer approach that reconstructs condition-specific genes and their regulation through an integrative analysis of large-scale gene expression, protein interaction and transcriptional regulation (transcription factor-target gene relationship) data. We establish the accuracy of our methodology on synthetic datasets as well as a yeast dataset. We then extend the framework to higher eukaryotic systems, including human breast cancer and Arabidopsis thaliana cold acclimation. Our study identifies TACSTD2 (TROP2) as a target gene for human breast cancer and finds that it is regulated by the transcription factors CREB and NFkB. We also predict that KIF2C is a target gene for ER-/HER2- breast cancer and is positively regulated by E2F1. These predictions were further confirmed through experimental studies. AVAILABILITY: The implementation and detailed protocol of the multi-layer approach are available at http://www.egr.msu.edu/changroup/Protocols/Three-layer%20approach%20to%20reconstruct%20condition.html.
