Identification of Functional Elements and Regulatory Circuits by Drosophila modENCODE

has an advisory role with DNANexus, a DNA sequence storage and analysis company. Transfer of GFP-tagged fosmids requires a Materials Transfer Agreement with the Max Planck Institute of Molecular Cell Biology and Genetics. Raw microarray data are available from the Gene Expression Omnibus archive, and raw sequencing data are available from the SRA archive (accessions are in table S18). We thank C. Jan and D. Bartel for sharing data on poly(A) sites before publication, WormBase curator G. Williams for assistance in quality checking and preparing the transcriptomics data sets for publication, as well as his fellow curator P. Davis for reviewing and hand-checking the list of pseudogenes.
To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.
Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets. Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology. The functions of ~40% of the protein- and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8–10), and comparative genomic …