Intrinsic Properties of tRNA Molecules as Deciphered via Bayesian Network and Distribution Divergence Analysis

The identity and recognition of tRNAs by aminoacyl-tRNA synthetases (and other molecules) is a complex phenomenon with major implications for the origin and evolution of the translation machinery and the genetic code, the evolution and speciation of tRNAs themselves, human mitochondrial diseases, and artificial genetic code engineering. Deciphering it through laboratory experiments, however, is difficult and both time- and resource-intensive. In this study, we propose a mathematically rigorous, two-pronged in silico approach, rooted in machine learning and information theory, to identifying and classifying the tRNA positions important for tRNA identity and recognition. We apply Bayesian network modeling to elucidate the structure of relationships within the tRNA molecule, and distribution divergence analysis to identify meaningful differences between molecules of different tRNA subclasses. We illustrate the complementary application of these two approaches using tRNA examples from the three domains of life, and identify and discuss the important (informative) positions therein. In summary, we provide the tRNA research community with a novel, comprehensive methodology for identifying specific elements of interest in various tRNA molecules, which can be followed up with corresponding experimental work and/or high-resolution, position-specific statistical analyses.
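As a rough illustration of what a per-position distribution divergence comparison between two tRNA subclasses might look like, the sketch below scores each aligned position by the Jensen-Shannon divergence between the nucleotide distributions of the two groups. It is a minimal sketch under stated assumptions, not the procedure used in the study: the choice of the Jensen-Shannon measure, the pseudocount smoothing, the assumption of pre-aligned equal-length sequences, and all function names are illustrative.

```python
import math
from collections import Counter

ALPHABET = "ACGU"  # assumption: gap and modified-base symbols are ignored


def column_distribution(sequences, position, pseudocount=1.0):
    """Smoothed nucleotide frequencies at one aligned position."""
    counts = Counter(
        seq[position] for seq in sequences if seq[position] in ALPHABET
    )
    total = sum(counts.values()) + pseudocount * len(ALPHABET)
    return [(counts.get(base, 0) + pseudocount) / total for base in ALPHABET]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (symmetric, between 0 and 1)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_positions(class_a, class_b):
    """Rank aligned positions by divergence between two sequence sets.

    Hypothetical helper: assumes every sequence has the same length.
    Returns (position, score) pairs, most divergent first.
    """
    length = len(class_a[0])
    scores = [
        (
            pos,
            js_divergence(
                column_distribution(class_a, pos),
                column_distribution(class_b, pos),
            ),
        )
        for pos in range(length)
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy, made-up sequences: the two groups differ only at index 2.
    group_a = ["GCAU", "GCAU", "GCAU"]
    group_b = ["GCGU", "GCGU", "GCGU"]
    for pos, score in rank_positions(group_a, group_b):
        print(pos, round(score, 3))
```

Positions with high scores are those where the two subclasses use different nucleotides, which makes them candidates for closer inspection; positions that are conserved across both groups score zero.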
