FastMLST: A Multi-core Tool for Multilocus Sequence Typing of Draft Genome Assemblies

Multilocus Sequence Typing (MLST) is a precise microbial typing approach used at the intra-species level for epidemiological and evolutionary studies. It assigns a sequence type (ST) identifier to each specimen based on the combination of allelic sequences found for the housekeeping genes included in a defined scheme. The use of MLST has grown with the availability of large numbers of genomic sequences and epidemiological data in public repositories. However, the massive size of these datasets has made data processing speed a problem. Here, we present FastMLST, a tool designed to perform PubMLST searches using BLASTn and a divide-and-conquer approach. Compared with mlst, CGE/MLST, MLSTar, and PubMLST, FastMLST takes advantage of current multi-core computers to type thousands of genome assemblies simultaneously in minutes, reducing processing times by at least 4-fold with more than 99.95% consistency.

Availability and Implementation: The source code, installation instructions, and documentation are available at https://github.com/EnzoAndree/FastMLST
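The ST assignment and batch typing described in the abstract can be illustrated with a minimal sketch. This is not FastMLST's code: the three-locus scheme, the allele numbers, the locus names, and the functions `assign_st` and `type_genomes` are all hypothetical, and the allele numbers are assumed to have been determined already (for example by a sequence search). A thread pool stands in for the multi-core step, on the assumption that a real worker would spend its time in an external search process.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 3-locus scheme: allele-number profile -> sequence type (ST).
LOCI = ("geneA", "geneB", "geneC")
PROFILES = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 7,
}


def assign_st(alleles):
    """Return the ST for a dict of locus -> allele number.

    Returns None when a locus is missing or the allele combination
    is not present in the (hypothetical) scheme.
    """
    try:
        key = tuple(alleles[locus] for locus in LOCI)
    except KeyError:
        return None
    return PROFILES.get(key)


def type_genomes(genomes, workers=4):
    """Type many genomes independently, one task per genome.

    `genomes` maps a genome name to its locus -> allele dict.
    Each genome is an independent subproblem, so the work can be
    split across workers and the results merged afterwards.
    """
    names = list(genomes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sts = list(pool.map(assign_st, (genomes[n] for n in names)))
    return dict(zip(names, sts))


if __name__ == "__main__":
    example = {
        "genome_1": {"geneA": 1, "geneB": 2, "geneC": 1},
        "genome_2": {"geneA": 3, "geneB": 2, "geneC": 4},
        "genome_3": {"geneA": 9, "geneB": 9, "geneC": 9},  # unknown profile
    }
    print(type_genomes(example))
```

Because each genome is typed independently, the number of workers can be raised without changing the results, only the wall-clock time.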
