Characterizing disease states from topological properties of transcriptional regulatory networks

Background: High-throughput gene expression experiments yield large amounts of data that can augment our understanding of disease processes, in addition to classifying samples. Here we present new paradigms of data separation based on the construction of transcriptional regulatory networks for normal and abnormal cells using sequence predictions, literature-based data and gene expression studies. We analyzed expression datasets from a number of diseased and normal cells, including different types of acute leukemia and breast cancer with variable clinical outcome.

Results: We constructed sample-specific regulatory networks to identify links between transcription factors (TFs) and regulated genes that differentiate between healthy and diseased states. This approach has the advantage of identifying key transcription factor-gene pairs with differential activity between healthy and diseased states, rather than relying on gene expression profiles alone, and thus points to processes that may be involved in gene deregulation. We then generalized this approach by studying simultaneous changes in the functionality of multiple regulatory links pointing to a regulated gene or emanating from one TF (that is, changes in gene centrality as defined by its in-degree or out-degree, respectively). We found that samples can often be separated more robustly using these measures of gene centrality than using individual links.

We examined distributions of distances (the number of links needed to traverse the path between each pair of genes) in the transcriptional networks for gene subsets whose collective expression profiles could best separate each dataset into predefined groups. We found that genes that optimally classify samples are concentrated in neighborhoods in the gene regulatory networks. This suggests that genes that are deregulated in diseased states exhibit a remarkable degree of connectivity.

Conclusion: Transcription factor-regulated gene links and the centrality of genes in transcriptional networks can be used to differentiate between cell types. Transcriptional network blueprints can be used as a basis for further research into gene deregulation in diseased states.
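
The centrality and distance measures named in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy blueprint network, the gene names, the expression values, the rule that a link counts as active when both the TF and its target exceed a threshold, and the choice to ignore link direction when counting distances. The sketch only shows what in-degree, out-degree and link-count distance mean on a sample-specific network.

```python
from collections import defaultdict, deque

# Hypothetical blueprint network: (transcription factor, regulated gene) links.
BLUEPRINT = [
    ("TF1", "G1"), ("TF1", "G2"), ("TF2", "G2"), ("TF2", "G3"), ("TF1", "TF2"),
]

# Hypothetical expression values for two samples.
healthy = {"TF1": 2.0, "TF2": 2.0, "G1": 2.0, "G2": 2.0, "G3": 2.0}
diseased = {"TF1": 2.0, "TF2": 0.2, "G1": 2.0, "G2": 2.0, "G3": 2.0}


def active_links(expression, threshold=1.0):
    """Sample-specific network: keep blueprint links whose TF and target
    are both expressed above the threshold (an illustrative rule only)."""
    return [
        (tf, gene)
        for tf, gene in BLUEPRINT
        if expression.get(tf, 0.0) > threshold
        and expression.get(gene, 0.0) > threshold
    ]


def degrees(links):
    """Return (in_degree, out_degree) dictionaries for a list of links."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for tf, gene in links:
        out_deg[tf] += 1   # links emanating from a TF
        in_deg[gene] += 1  # links pointing to a regulated gene
    return dict(in_deg), dict(out_deg)


def distance(links, source, target):
    """Number of links on the shortest path between two genes, ignoring
    link direction (an assumption); None if no path exists."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search
        node = queue.popleft()
        if node == target:
            return seen[node]
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return None


if __name__ == "__main__":
    for name, sample in (("healthy", healthy), ("diseased", diseased)):
        in_deg, out_deg = degrees(active_links(sample))
        print(name, "in-degree:", in_deg, "out-degree:", out_deg)
    print("G1-G3 distance in blueprint:", distance(BLUEPRINT, "G1", "G3"))
```

In this toy case, lowering the expression of one TF removes several links at once, so the in-degree and out-degree of the affected genes change together. This is the sense in which a centrality measure aggregates changes across multiple links rather than relying on any single one.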
