Genetic algorithm learning as a robust approach to RNA editing site prediction

RNA editing is one of several post-transcriptional modifications that may contribute to organismal complexity despite the limited gene complement of a genome. One form, known as C → U editing, appears to exist in a wide range of organisms, but most instances of it have been discovered serendipitously. With the large amount of genomic and transcriptomic data now available, computational analysis could provide a more rapid means of identifying novel sites of C → U RNA editing. Previous efforts have had some success but also some limitations. We present a computational method for identifying C → U RNA editing sites in genomic sequences that is both robust and generalizable. We evaluate its potential use on the best data set available for these purposes: C → U editing sites in plant mitochondrial genomes. Our method is derived from a machine learning approach known as a genetic algorithm. REGAL (RNA Editing site prediction by Genetic Algorithm Learning) is 87% accurate when tested on three mitochondrial genomes, with an overall sensitivity of 82% and an overall specificity of 91%. REGAL's performance significantly improves on other ab initio approaches to predicting RNA editing sites in this data set. REGAL has comparable sensitivity to, and higher specificity than, approaches that rely on sequence homology, and it has the advantage that strong sequence conservation is not required for reliable prediction of editing sites. Our results suggest that ab initio methods can generate robust classifiers of putative editing sites, and we highlight the value of combinatorial approaches as embodied by genetic algorithms. We present REGAL as one approach with the potential to be generalized to other organisms exhibiting C → U RNA editing.
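
To make the three reported measures concrete, the sketch below shows how accuracy, sensitivity, and specificity are commonly computed from a confusion matrix. It assumes the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the abstract does not state which definitions were used, and some of the gene-prediction literature defines specificity differently. The function name and the example counts are hypothetical and are not taken from the REGAL evaluation.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity for a binary classifier.

    tp: edited sites predicted as edited (true positives)
    fp: unedited sites predicted as edited (false positives)
    tn: unedited sites predicted as unedited (true negatives)
    fn: edited sites predicted as unedited (false negatives)
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one site is required")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both edited and unedited sites are required")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # fraction of true edited sites recovered
        "specificity": tn / (tn + fp),  # fraction of unedited sites correctly rejected
    }


# Hypothetical counts for a balanced test set of 100 edited and 100 unedited sites.
example = classification_metrics(tp=82, fp=9, tn=91, fn=18)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

On a balanced test set, accuracy under these definitions is the mean of sensitivity and specificity, which is one way to read the three figures quoted above together.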
