A hyper-graph approach for analyzing transcriptional networks in breast cancer

Breast cancer is the most common malignancy and a leading cause of cancer-related deaths in women. In recent years, gene expression profiling has proved useful in delineating molecular subtypes of breast cancer and in developing prognostic signatures. We are developing an analytical pipeline to characterize the transcriptional regulators of these subtypes and signatures. Our approach complements current bioinformatics methods for transcription factor analysis with a vertex cover algorithm on hypergraphs. We use this approach to build a network that links the genes differentially expressed in a tumor subtype, or belonging to a predefined signature, to the candidate transcription factors regulating them. Maximum cardinality and minimum weight vertex covers in hypergraphs are used to choose a set of candidate transcription factors that (1) is provably within a small factor of the optimum cover, and (2) contains the key regulators of disease pathogenesis. The model can then be used to predict the most important transcription factors regulating the network, and to find modules, that is, combinations of transcription factors regulating different functional subsets of genes. We test our approach on cell line data in the context of estrogen receptor mediated transcription and demonstrate that it recovers previously known or expected regulators. We then apply the method to a primary breast cancer cohort partitioned into two groups with prognostic differences, defined by high or low levels of an insulin-like growth factor gene expression signature. These results suggest that the method has the potential to identify transcription factors regulating different molecular subtypes of breast cancer.
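
The abstract does not say which approximation algorithm is used, so the following is only a minimal sketch of one standard way to obtain a weighted vertex cover in a hypergraph with a provable guarantee: the local-ratio scheme, whose cover weight is at most f times the optimum, where f is the size of the largest hyperedge. It assumes, for illustration, that each gene is a hyperedge whose vertices are its candidate transcription factors, so that a cover selects at least one regulator per gene. The function name, the gene and transcription factor labels, and the weights are all hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set


def local_ratio_vertex_cover(
    hyperedges: Iterable[Set[Hashable]],
    weights: Dict[Hashable, float],
    tol: float = 1e-12,
) -> Set[Hashable]:
    """Return a vertex cover whose weight is at most f times the optimum,
    where f is the largest hyperedge size (local-ratio scheme)."""
    residual = dict(weights)  # remaining weight of each vertex
    cover: Set[Hashable] = set()
    for edge in hyperedges:
        if not edge:
            raise ValueError("an empty hyperedge cannot be covered")
        if any(v in cover for v in edge):
            continue  # this hyperedge is already covered
        # Charge every vertex of the edge by the smallest residual weight.
        delta = min(residual[v] for v in edge)
        for v in edge:
            residual[v] -= delta
            if residual[v] <= tol:
                cover.add(v)  # vertex fully paid for: take it
    return cover


# Hypothetical toy input: each gene maps to its candidate regulators.
gene_to_tfs = {
    "GENE_A": {"TF1", "TF2"},
    "GENE_B": {"TF2", "TF3"},
    "GENE_C": {"TF3"},
    "GENE_D": {"TF1", "TF2", "TF4"},
}
# Hypothetical weights; a lower weight marks a more plausible regulator.
tf_weights = {"TF1": 2.0, "TF2": 1.0, "TF3": 1.0, "TF4": 3.0}

cover = local_ratio_vertex_cover(gene_to_tfs.values(), tf_weights)
print(sorted(cover))  # ['TF2', 'TF3']
```

On this toy input the selected set {TF2, TF3} has weight 2, which happens to equal the optimum; in general the guarantee is only the factor f. Setting every weight to 1 turns the same routine into an approximation for the unweighted (cardinality) version.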
