GREAT improves functional interpretation of cis-regulatory regions

We developed the Genomic Regions Enrichment of Annotations Tool (GREAT) to analyze the functional significance of cis-regulatory regions identified by localized measurements of DNA binding events across an entire genome. Whereas previous methods took into account only binding proximal to genes, GREAT properly incorporates distal binding sites and controls for false positives using a binomial test over the input genomic regions. GREAT incorporates annotations from 20 ontologies and is available as a web application. Applying GREAT to data sets from chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) of multiple transcription-associated factors, including SRF, NRSF, GABP, Stat3 and p300 in different developmental contexts, we recover many functions of these factors that are missed by existing gene-based tools, and we generate testable hypotheses. The utility of GREAT is not limited to ChIP-seq: it could also be applied to open chromatin, localized epigenomic markers and similar functional data sets, as well as to comparative genomics sets.
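As an illustration of what a binomial test over input genomic regions can look like, the following is a minimal sketch, not GREAT's implementation. It assumes each input region independently lands in the regulatory domain of an annotation with probability equal to the fraction of the genome associated with that annotation, and it reports the upper-tail probability of seeing at least the observed number of hits. The function name and the parameters `n_regions`, `k_hits` and `genome_fraction` are hypothetical and do not come from the abstract.

```python
from math import exp, lgamma, log


def binomial_enrichment_p(n_regions, k_hits, genome_fraction):
    """Return P(X >= k_hits) for X ~ Binomial(n_regions, genome_fraction).

    n_regions       -- number of input genomic regions (assumed parameter)
    k_hits          -- regions falling in the annotation's regulatory domain
    genome_fraction -- fraction of the genome associated with the annotation
    """
    if not 0.0 < genome_fraction < 1.0:
        raise ValueError("genome_fraction must lie strictly between 0 and 1")
    if not 0 <= k_hits <= n_regions:
        raise ValueError("k_hits must lie between 0 and n_regions")

    log_p = log(genome_fraction)
    log_q = log(1.0 - genome_fraction)
    total = 0.0
    for i in range(k_hits, n_regions + 1):
        # Log of the binomial coefficient, computed via log-gamma so that
        # large region counts do not overflow.
        log_coeff = lgamma(n_regions + 1) - lgamma(i + 1) - lgamma(n_regions - i + 1)
        total += exp(log_coeff + i * log_p + (n_regions - i) * log_q)
    return min(total, 1.0)


# Hypothetical numbers: 50 of 1,000 regions hit a term covering 2% of the genome.
p_value = binomial_enrichment_p(1000, 50, 0.02)
```

Because the test counts regions rather than genes, a term is not automatically favored just because its genes have large regulatory domains: a larger domain raises `genome_fraction` and therefore the number of hits expected by chance.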
