SSS-test: a novel test for detecting positive selection on RNA secondary structure

Background: Long non-coding RNAs (lncRNAs) play an important role in regulating gene expression and are thus important for determining phenotypes. Most attempts to measure selection in lncRNAs have focused on the primary sequence. The majority of small RNAs and at least some parts of lncRNAs must fold into specific structures to perform their biological function. Comprehensive assessments of selection acting on RNAs must therefore also encompass structure. Selection pressures acting on the structure of non-coding genes can be detected within multiple sequence alignments. Approaches of this type, however, have so far focused on negative selection. Thus, a computational method for identifying ncRNAs under positive selection is needed.

Results: We introduce the SSS-test (test for Selection on Secondary Structure) to identify positive selection and thus adaptive evolution. Benchmarks with biological as well as synthetic controls yield coherent signals for both negative and positive selection, demonstrating the functionality of the test. A survey of a lncRNA collection comprising 15,443 families resulted in 110 candidates that appear to be under positive selection in human. In 26 lncRNAs that have been associated with psychiatric disorders we identified local structures that show signs of positive selection in the human lineage.

Conclusions: It is feasible to assay positive selection acting on RNA secondary structures on a genome-wide scale. The detection of human-specific positive selection in lncRNAs associated with cognitive disorders provides a set of candidate genes for further experimental testing and may provide insights into the evolution of cognitive abilities in humans.

Availability: The SSS-test and related software are available at: https://github.com/waltercostamb/SSS-test. The databases used in this work are available at: http://www.bioinf.uni-leipzig.de/Software/SSS-test/.
