Lessons from modENCODE.

The modENCODE (Model Organism Encyclopedia of DNA Elements) Consortium aimed to map functional elements (including transcripts, chromatin marks, regulatory factor binding sites, and origins of DNA replication) in the model organisms Drosophila melanogaster and Caenorhabditis elegans. During its five-year span, the consortium conducted more than 2,000 genome-wide assays in developmentally staged animals, dissected tissues, and homogeneous cell lines. Analysis of these data sets provided foundational insights into genome, epigenome, and transcriptome structure and the evolutionary turnover of regulatory pathways. These studies facilitated a comparative analysis with similar data types produced by the ENCODE Consortium for human cells. Genome organization differs drastically in these distant species, and yet quantitative relationships among chromatin state, transcription, and cotranscriptional RNA processing are deeply conserved. Of the many biological discoveries of the modENCODE Consortium, we highlight insights that emerged from integrative studies. We focus on operational and scientific lessons that may aid future projects of similar scale or aims in other, emerging model systems.
