Systematic Identification of cis-Regulatory Sequences Active in Mouse and Human Embryonic Stem Cells

Understanding the transcriptional regulation of pluripotent cells is of fundamental interest and will greatly inform efforts aimed at directing differentiation of embryonic stem (ES) cells or reprogramming somatic cells. We first analyzed the transcriptional profiles of mouse ES cells and primordial germ cells and identified genes upregulated in pluripotent cells both in vitro and in vivo. These genes are enriched for roles in transcription, chromatin remodeling, cell cycle, and DNA repair. We developed a novel computational algorithm, CompMoby, which combines analyses of sequences both aligned and non-aligned between different genomes with a probabilistic segmentation model to systematically predict short DNA motifs that regulate gene expression. CompMoby was used to identify conserved overrepresented motifs in genes upregulated in pluripotent cells. We show that the motifs are preferentially active in undifferentiated mouse ES and embryonic germ cells in a sequence-specific manner, and that they can act as enhancers in the context of an endogenous promoter. Importantly, the activity of the motifs is conserved in human ES cells. We further show that the transcription factor NF-Y specifically binds to one of the motifs, is differentially expressed during ES cell differentiation, and is required for ES cell proliferation. This study provides novel insights into the transcriptional regulatory networks of pluripotent cells. Our results suggest that this systematic approach can be broadly applied to understanding transcriptional networks in mammalian species.
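The notion of an overrepresented motif used above can be illustrated with a small sketch. This is not CompMoby: it omits the cross-genome alignment step and the probabilistic segmentation model, and applies no multiple-testing correction. It only counts how many sequences in a foreground set (for example, promoters of upregulated genes) contain each k-mer and compares that with a background set using a hypergeometric tail probability. The function names, the choice of k = 6, and the significance cutoff are illustrative assumptions, not values from this study.

```python
from math import comb

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers_present(seq, k):
    """Set of strand-independent (canonical) k-mers occurring in seq."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
            found.add(min(kmer, revcomp(kmer)))
    return found


def hypergeom_tail(hits, n_fg, total_hits, n_total):
    """P(X >= hits) when n_fg sequences are drawn from n_total,
    of which total_hits contain the k-mer."""
    upper = min(n_fg, total_hits)
    numerator = sum(
        comb(total_hits, x) * comb(n_total - total_hits, n_fg - x)
        for x in range(hits, upper + 1)
    )
    return numerator / comb(n_total, n_fg)


def enriched_kmers(foreground, background, k=6, alpha=0.05):
    """Return (kmer, fg_hits, total_hits, p) tuples with p <= alpha, sorted by p.

    Hypothetical helper for illustration; p-values are uncorrected.
    """
    fg_sets = [kmers_present(s, k) for s in foreground]
    bg_sets = [kmers_present(s, k) for s in background]
    n_fg, n_total = len(fg_sets), len(fg_sets) + len(bg_sets)
    results = []
    for kmer in set().union(*fg_sets):
        fg_hits = sum(kmer in s for s in fg_sets)
        total_hits = fg_hits + sum(kmer in s for s in bg_sets)
        p = hypergeom_tail(fg_hits, n_fg, total_hits, n_total)
        if p <= alpha:
            results.append((kmer, fg_hits, total_hits, p))
    return sorted(results, key=lambda r: (r[3], r[0]))
```

Calling `enriched_kmers(foreground, background)` with two lists of sequence strings returns the k-mers found in disproportionately many foreground sequences. Counting each k-mer once per sequence, rather than once per occurrence, keeps a single repeat-rich promoter from dominating the score.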
