Inferences on Mycobacterium leprae Host Immune Response Escape and Antibiotic Resistance Using Genomic Data and GenomeFastScreen

Identifying, in bacteria, the set of genes and amino acid positions that show evidence for positive selection can give insight into, among other things, which genes and amino acid positions are responsible for modulating the host immune response. However, such analyses are time consuming, and the frequency of genes showing evidence for positively selected amino acid sites (PSS) can be low. Therefore, quickly identifying the set of genes that likely show PSS can lead to great savings in both computational and research time. Here, we present GenomeFastScreen, a Compi-based pipeline distributed as a Docker image, that automates the process of identifying genes that likely show PSS, starting from a set of FASTA files, one per genome, containing all coding sequences. GenomeFastScreen automatically removes problematic sequences, such as those showing ambiguous positions, and identifies orthologous gene sets. It is also possible to identify the orthologous genes in an external reference species, which is a requirement for comparisons across species and for conducting gene ontology enrichment analyses when there are no data for the species being analysed. An example of what can be achieved with the GenomeFastScreen pipeline is given for Mycobacterium leprae, which causes leprosy. In this species, after detailed analyses, PSS were found in 31 genes, including nine genes likely relevant in the context of leprosy. The orthologs of those genes in M. tuberculosis (the species used as external reference) are Rv3632 (a membrane protein gene), Rv0177 (a mce1 gene), PPE68 (a cell envelope protein), RpfB (a resuscitation-promoting factor), RecG (which provides protection against mitomycin C), lipQ and lipU (lipases), and Rv3220c and tesB1 (esterases). Therefore, the study of these genes may reveal interesting hints on the modulation of the different M. leprae phenotypes.
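The sequence-cleaning step mentioned above (removing coding sequences that show ambiguous positions) can be illustrated with a minimal Python sketch. This is not the GenomeFastScreen implementation: the function names `read_fasta` and `drop_ambiguous`, the toy sequences, and the rule that any character outside A, C, G, and T counts as ambiguous are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Assumption: only A, C, G and T are treated as unambiguous nucleotides.
UNAMBIGUOUS = set("ACGT")


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_ambiguous(records: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Keep only non-empty sequences made entirely of unambiguous nucleotides."""
    return {
        header: seq
        for header, seq in records
        if seq and set(seq.upper()) <= UNAMBIGUOUS
    }


# Hypothetical toy input: cds2 contains N and should be discarded.
example = """>cds1
ATGGCTAAATGA
>cds2
ATGNNNAAATGA
>cds3
atgccc
tga
"""

kept = drop_ambiguous(read_fasta(example))
print(sorted(kept))
```

In this toy input, only `cds2` is discarded because it contains `N`. Any other filtering criteria the pipeline may apply are not described in the text above and are therefore not modelled here.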
