Conserved Noncoding Elements Follow Power-Law-Like Distributions in Several Genomes as a Result of Genome Dynamics

Conserved, ultraconserved and other classes of constrained elements (collectively referred to here as CNEs), identified by comparative genomics in a wide variety of genomes, are non-randomly distributed across chromosomes. These elements are defined using various degrees of conservation between organisms and several minimal-length thresholds. Here we investigate the chromosomal distribution of CNEs by studying the statistical properties of the distances between consecutive CNEs. We find widespread power-law-like distributions of inter-CNE distances, i.e. linearity on a double-logarithmic scale, a feature connected with fractality and self-similarity. Because CNEs are often spatially associated with genes, especially those that regulate developmental processes, we verify by appropriate gene masking that the power-law-like pattern emerges irrespective of whether elements found close to or inside genes are excluded. To explain these findings, we put forward an evolutionary model that includes segmental or whole-genome duplication events and the elimination (loss) of most of the duplicated CNEs. Simulations reproduce the main features of the observed size distributions. Power-law-like patterns in the genomic distributions of CNEs are in accordance with current knowledge about their evolutionary history in several genomes.
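As a rough illustration of the measurement described above, the following minimal sketch computes the distances between consecutive CNEs on one chromosome, builds a cumulative size distribution, and estimates the slope of that distribution on a double-logarithmic scale. Everything here is an assumption made for illustration rather than the authors' pipeline: the function names are hypothetical, CNEs are taken to be `(start, end)` pairs with half-open coordinates, the cumulative form N(>= size) is one common choice of size distribution, and an ordinary least-squares line is only a crude indicator of power-law-like behaviour.

```python
import math


def inter_cne_distances(cnes):
    """Return gaps between consecutive CNEs on a single chromosome.

    `cnes` is a list of (start, end) pairs (assumed half-open coordinates).
    Overlapping or nested elements contribute no gap.
    """
    ordered = sorted(cnes)
    if not ordered:
        return []
    gaps = []
    prev_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > prev_end:
            gaps.append(start - prev_end)
        # Track the furthest end seen so nested elements are handled.
        prev_end = max(prev_end, end)
    return gaps


def cumulative_size_distribution(gaps):
    """Return (size, count) pairs, where count is the number of gaps >= size."""
    ordered = sorted(gaps)
    n = len(ordered)
    points = []
    for i, gap in enumerate(ordered):
        if i == 0 or gap != ordered[i - 1]:
            points.append((gap, n - i))
    return points


def loglog_slope(points):
    """Least-squares slope of log10(count) against log10(size).

    Needs at least two distinct sizes; a straight line here is what
    'linearity on a double-logarithmic scale' refers to.
    """
    xs = [math.log10(size) for size, _ in points]
    ys = [math.log10(count) for _, count in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy coordinates, invented purely to exercise the functions.
toy_cnes = [(0, 10), (20, 30), (50, 60), (100, 110)]
toy_gaps = inter_cne_distances(toy_cnes)
toy_points = cumulative_size_distribution(toy_gaps)
toy_slope = loglog_slope(toy_points)
```

In practice the fit would be restricted to the range of distances over which the log-log plot is approximately linear, and gene masking would amount to filtering the input list before the distances are computed.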
