Identification of muscle-specific regulatory modules in Caenorhabditis elegans.

Transcriptional regulation is the major regulatory mechanism that controls the spatial and temporal expression of genes during development. This regulation is carried out by transcription factors (TFs), which recognize and bind to their cognate binding sites. Recent studies suggest a modular organization of TF-binding sites, in which clusters of sites cooperate in regulating downstream gene expression. In this study, we report the computational identification and experimental verification of muscle-specific cis-regulatory modules in Caenorhabditis elegans. We first identified a set of motifs that are correlated with muscle-specific gene expression. We then predicted muscle-specific regulatory modules from clusters of those motifs whose characteristics resemble those of a collection of well-studied modules in other species. The method correctly identifies 88% of the experimentally characterized modules with a positive predictive value of at least 65%. The prediction accuracy of muscle-specific expression on an independent test set is highly significant (P < 0.0001). We performed in vivo experimental tests of 12 predicted modules, and 10 of them drive muscle-specific gene expression. These results suggest that our method is highly accurate in identifying functional sequences important for muscle-specific gene expression and is a valuable tool for guiding experimental design.
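
To make the idea of predicting modules from clusters of motifs concrete, the following is a minimal sketch of a generic sliding-window clustering of motif hits. It is not the authors' scoring scheme, which the abstract does not specify. The function name `find_motif_clusters` and the parameters `window`, `min_hits` and `min_distinct` are hypothetical, as are the example hit positions and motif names.

```python
from typing import List, Tuple

Hit = Tuple[int, str]  # (position in bp, motif name); illustrative format only


def find_motif_clusters(
    hits: List[Hit],
    window: int = 200,      # assumed window size in bp, not from the paper
    min_hits: int = 3,      # assumed minimum number of hits per window
    min_distinct: int = 2,  # assumed minimum number of different motifs
) -> List[Tuple[int, int]]:
    """Return merged (start, end) intervals in which hits spanning at most
    `window` bp include at least `min_hits` hits from at least
    `min_distinct` different motifs."""
    hits = sorted(hits)
    clusters: List[Tuple[int, int]] = []
    left = 0
    for right in range(len(hits)):
        # Shrink the window from the left until it spans at most `window` bp.
        while hits[right][0] - hits[left][0] > window:
            left += 1
        in_window = hits[left:right + 1]
        distinct = {motif for _, motif in in_window}
        if len(in_window) >= min_hits and len(distinct) >= min_distinct:
            start, end = in_window[0][0], in_window[-1][0]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous cluster: extend it.
                clusters[-1] = (clusters[-1][0], max(clusters[-1][1], end))
            else:
                clusters.append((start, end))
    return clusters


# Made-up motif hits upstream of a hypothetical gene.
example_hits = [
    (10, "A"), (50, "B"), (120, "A"),
    (900, "C"),
    (1500, "A"), (1540, "B"), (1600, "C"),
]
print(find_motif_clusters(example_hits))  # [(10, 120), (1500, 1600)]
```

The isolated hit at position 900 is not reported because it has no neighbouring hits within the window. A real pipeline would additionally score each candidate cluster against the characteristics of known modules before calling it a prediction.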
