Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project

From Genome to Regulatory Networks

For biologists, having a genome in hand is only the beginning; much more investigation is needed to characterize how the genome is used to produce a functional organism (see the Perspective by Blaxter). In this vein, Gerstein et al. (p. 1775) summarize for the Caenorhabditis elegans genome, and The modENCODE Consortium (p. 1787) summarize for the Drosophila melanogaster genome, full transcriptome analyses over developmental stages, genome-wide identification of transcription factor binding sites, and high-resolution maps of chromatin organization. Both studies identified highly occupied target (HOT) regions in the nematode and fly genomes, where DNA was bound by more than 15 of the transcription factors analyzed, and characterized the expression of related genes. Overall, the studies provide insights into the organization, structure, and function of the two genomes and provide basic information needed to guide and correlate both focused and genome-wide studies.

Extensive analysis of the Caenorhabditis elegans genome reveals regions highly occupied by multiple transcription factors.

We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From these data, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors.
Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
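As a rough illustration of the idea of locations bound by an unusually large number of transcription factors, the sketch below counts distinct factors per fixed-width genomic bin and reports bins bound by more than 15 factors, the cutoff quoted in the summary. It is not the consortium's pipeline: the function name `find_hot_bins`, the `(factor, chrom, start, end)` input layout, and the 1 kb binning are all assumptions made for illustration.

```python
from collections import defaultdict

# Cutoff quoted in the summary text: "more than 15" factors.
HOT_THRESHOLD = 15


def find_hot_bins(binding_sites, bin_size=1000, threshold=HOT_THRESHOLD):
    """Return {(chrom, bin_start): n_factors} for bins bound by > threshold factors.

    binding_sites: iterable of (factor, chrom, start, end) tuples with
    half-open coordinates. This layout is a hypothetical convenience,
    not a format defined by the paper.
    """
    factors_per_bin = defaultdict(set)
    for factor, chrom, start, end in binding_sites:
        if end <= start:
            continue  # skip empty or malformed intervals
        first_bin = start // bin_size
        last_bin = (end - 1) // bin_size
        # A site that spans several bins is credited to each of them.
        for b in range(first_bin, last_bin + 1):
            factors_per_bin[(chrom, b * bin_size)].add(factor)
    # Using a set per bin means each factor is counted once per bin.
    return {
        key: len(factors)
        for key, factors in factors_per_bin.items()
        if len(factors) > threshold
    }
```

Fixed-width bins keep the sketch short; merging overlapping peaks into variable-width regions would be a reasonable alternative, and the threshold is exposed as a parameter because the appropriate cutoff depends on how many factors were assayed.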

Raymond K. Auerbach | Kevin Y. Yip | Stefan R. Henz | Gennifer E. Merrihew | Bradley I. Arshinoff | Sebastian D. Mackowiak | P. Green | M. Gerstein | S. Lewis | S. Henikoff | J. Henikoff | L. Stein | F. Slack | A. Rechtsteiner | L. Hillier | R. Waterston | G. Zeller | Gunnar Rätsch | F. Piano | K. Gunsalus | V. Reinke | Stuart K. Kim | M. Mangone | J. Ahringer | E. Segal | M. MacCoss | Z. Lu | P. Good | M. Guyer | M. Snyder | Ekta Khurana | Jing Leng | H. Clawson | A. Hinrichs | E. Preston | J. Murray | N. Rajewsky | J. Lieb | J. Rozowsky | E. Feingold | G. Euskirchen | G. Barber | Tao Liu | X. Liu | S. Ooi | A. Sboner | Xin Feng | E. Lai | John K. Kim | Kim M Rutherford | R. Lyne | C. Shou | B. Ewing | Koon-Kiu Yan | Chao Cheng | Roger Alexander | P. Alves | Thea A. Egelhofer | A. Vielle | I. Latorre | Ming-Sin Cheung | Sevinç Ercan | Kohta Ikegami | M. Jensen | P. Kolasinska-Zwierz | H. Rosenbaum | Hyunjin Shin | S. Taing | T. Takasaki | A. Desai | A. Dernburg | S. Strome | M. Perry | L. Habegger | R. Lowdon | L. Lochovsky | Eric L. Van Nostrand | R. Robilotto | A. Chateigner | M. Morris | W. Niu | K. Rhrissorrakrai | A. Agarwal | Cathleen Brdlik | J. Brennan | J. Brouillet | Adrian Carr | S. Contrino | Luke O Dannenberg | L. Dick | A. Dosé | Jiang Du | R. Gassmann | F. Gullier | M. Gutwein | T. Han | H. Holster | T. Hyman | A. Iniguez | J. Janette | Masaomi Kato | W. Kent | E. Kephart | Vishal Khivansara | A. Leahey | P. Lloyd | Yaniv Lubling | S. McKay | D. Mecenas | David M. Miller | A. Muroyama | Hoang N. Pham | T. Phippen | P. Ruzanov | M. Sarov | R. Sasidharan | P. Scheid | C. Slightam | Richard Smith | W. C. Spencer | E. Stinson | D. Vafeados | K. Voronina | Guilin Wang | N. Washington | Christina M. Whittle | Beijing Wu | Z. Zha | Mei Zhong | Xingliang Zhou | G. Micklem | Roger P. Alexander | Heidi Rosenbaum | Ting Han | Rajkumar Sasidharan | Marco Mangone | Kahn Rhrissorrakrai
