Computational Methods for Analyzing Gene Regulation in Model Organisms

Research Thesis in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science

Acknowledgments. We thank Merck for a small grant and the CGC for strains used in "A Genomic Bias for Genotype-environment Interactions in C. elegans".

"Spatial localization of co-regulated genes is greater than genomic gene clustering in the S. cerevisiae genome."

Figure S5. The effect of permuting gene identities on the enrichment of co-localized targets of GLN-3. As is evident in this figure, no significant enrichment remains after the gene identities are permuted, indicating that the enrichment of gene co-localization is statistically significant and stems from non-random proximity.

Figure S6. Expression of genes that participate in many co-localized regions compared with genes that participate in few co-localized regions.

Abstract

This dissertation comprises two separate research projects with a common goal: exploring gene regulation. In biology, gene regulation is a broad field that attempts to describe the molecular interactions between the various cellular factors that conspire to silence or activate the machinery in charge of compiling a gene from its source code, the DNA, into an executable thread, the protein, which in turn works in concert with other active machinery in the cell to determine the organism's phenotype. In the first project, we examine the environment's effect on gene regulation through the lens of evolution, comparing gene expression in 5 strains of the nematode C. elegans grown in 5 different media. We use robust statistical methods to show that highly regulated genes, as distinguished by intergenic lengths, motif concentration, and expression levels, are particularly biased towards genotype-environment interactions. Sequencing these strains, we find that genes with expression variation across genotypes are enriched for promoter SNPs, as expected. However, genes with genotype-environment interactions do not significantly differ from background in terms of their promoter SNPs.
Collectively, these results suggest that the highly regulated nature of particular genes predisposes them to exhibit genotype-environment interactions as a consequence of changes to upstream regulators. This observation may provide a deeper understanding of the origin of the extraordinary gene expression diversity present in even closely related species.

In the second project, we take a pragmatic approach and provide an analytical framework for exploring the structure of DNA and for detecting spatial co-localization of genomic markers. We go on to deploy this framework to build a 3D structural model of the Saccharomyces cerevisiae genome, and use it to provide evidence of widespread co-localization of the targets of cellular factors, termed Transcription …
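
The permutation control described in the Figure S5 caption can be illustrated with a short sketch. This is not the thesis's implementation: the statistic (mean pairwise Euclidean distance among a factor's target genes), the function names, and the toy coordinates are all assumptions made for illustration. The idea is only that, if the observed targets sit closer together in the 3D model than gene sets of the same size drawn at random, the co-localization is unlikely to be due to chance.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords, genes):
    """Average Euclidean distance over all pairs of the given genes."""
    pairs = list(itertools.combinations(genes, 2))
    return sum(math.dist(coords[a], coords[b]) for a, b in pairs) / len(pairs)


def permutation_p_value(coords, targets, n_permutations=1000, seed=0):
    """One-sided empirical p-value that `targets` lie closer together than
    random gene sets of the same size drawn from the same model."""
    rng = random.Random(seed)
    all_genes = sorted(coords)
    observed = mean_pairwise_distance(coords, targets)
    hits = 0
    for _ in range(n_permutations):
        # Permuting gene identities amounts to drawing a random set of labels.
        random_set = rng.sample(all_genes, len(targets))
        if mean_pairwise_distance(coords, random_set) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data (entirely made up): 45 background genes scattered in a cube,
# plus 5 hypothetical target genes placed close to one another.
toy_rng = random.Random(1)
coords = {
    f"g{i}": tuple(toy_rng.uniform(0, 100) for _ in range(3)) for i in range(45)
}
targets = [f"t{i}" for i in range(5)]
for name in targets:
    coords[name] = tuple(50 + toy_rng.uniform(-1, 1) for _ in range(3))

p_value = permutation_p_value(coords, targets)
print(f"empirical p-value: {p_value:.4f}")
```

A small p-value here only says that the toy targets are unusually close under this particular statistic. A real analysis would also need to account for confounders such as genes being adjacent on the same chromosome.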
