CoreTracker: accurate codon reassignment prediction, applied to mitochondrial genomes

Motivation: Codon reassignments have been reported across all domains of life. With the increasing number of sequenced genomes, systematic approaches for genetic code detection are essential for accurate downstream analyses. Three automated prediction tools exist so far: FACIL, GenDecoder and Bagheera. The last two are restricted to metazoan mitochondrial genomes and to CUG reassignments in yeast nuclear genomes, respectively. These tools can only analyze a single genome at a time, and their predictions are often not followed by a validation procedure, resulting in a high rate of false positives.

Results: We present CoreTracker, a new algorithm for the inference of sense-to-sense codon reassignments. CoreTracker identifies potential codon reassignments in a set of related genomes, then uses statistical evaluations and a random forest classifier to predict those that are the most likely to be correct. Predicted reassignments are then validated through a phylogeny-aware step that evaluates the impact of the new genetic code on the protein alignment. Handling a set of genomes simultaneously in a phylogenetic framework makes it possible to trace back the evolution of each reassignment, which provides information on its underlying mechanism. Applied to metazoan and yeast genomes, CoreTracker significantly outperforms existing methods on both precision and sensitivity.

Availability and implementation: CoreTracker is written in Python and available at https://github.com/UdeM-LBIT/CoreTracker.

Contact: mabrouk@iro.umontreal.ca

Supplementary information: Supplementary data are available at Bioinformatics online.
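The first step described in the abstract, identifying potential reassignments across a set of related genomes, can be illustrated with a minimal sketch. This is not CoreTracker's implementation: the function names, the toy codon table, the majority-vote consensus rule and the `min_support` threshold are all assumptions made for illustration, and the statistical tests, random forest classifier and phylogeny-aware validation are omitted.

```python
from collections import Counter, defaultdict

# Standard-code translations for the few codons used in this toy example only.
STANDARD = {"CTT": "L", "CTA": "L", "ACT": "T", "ACA": "T", "GGT": "G"}


def column_consensus(residues, min_fraction=0.5):
    """Return the majority amino acid of an alignment column, or None."""
    residues = [aa for aa in residues if aa != "-"]
    if not residues:
        return None
    aa, count = Counter(residues).most_common(1)[0]
    return aa if count / len(residues) > min_fraction else None


def candidate_reassignments(codon_alignment, min_support=2):
    """Flag (genome, codon) pairs that mostly align with a non-standard residue.

    codon_alignment maps each genome name to a list holding one codon
    (or None for a gap) per aligned column. Hypothetical input format.
    """
    genomes = list(codon_alignment)
    n_columns = len(codon_alignment[genomes[0]])
    tallies = defaultdict(Counter)

    for i in range(n_columns):
        codons = {g: codon_alignment[g][i] for g in genomes}
        # Translate every genome with the standard code to get the column.
        column = [STANDARD.get(c, "-") if c else "-" for c in codons.values()]
        consensus = column_consensus(column)
        if consensus is None:
            continue
        # Record which consensus residue each genome's codon aligns with.
        for genome, codon in codons.items():
            if codon in STANDARD:
                tallies[(genome, codon)][consensus] += 1

    candidates = {}
    for (genome, codon), counts in tallies.items():
        aa, support = counts.most_common(1)[0]
        if aa != STANDARD[codon] and support >= min_support:
            candidates[(genome, codon)] = (STANDARD[codon], aa, support)
    return candidates


# Toy data: genome D uses CTT where the other genomes encode threonine.
ALIGNMENT = {
    "A": ["ACT", "CTT", "ACA", "GGT"],
    "B": ["ACA", "CTA", "ACT", "GGT"],
    "C": ["ACT", "CTT", "ACT", "GGT"],
    "D": ["CTT", "CTT", "CTT", "GGT"],
}

print(candidate_reassignments(ALIGNMENT))
# {('D', 'CTT'): ('L', 'T', 2)}
```

In this toy alignment, CTT in genome D aligns with a threonine consensus in two columns and with leucine in one, so it is flagged as a candidate Leu-to-Thr reassignment. In the approach the abstract describes, such candidates would still have to pass the statistical evaluation, the classifier and the phylogeny-aware validation before being reported.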
