Bacterial species identification from MALDI-TOF mass spectra through data analysis and machine learning.

At present, MALDI-TOF MS methodology for the characterization of bacteria varies considerably, owing to differences in, e.g., sample preparation methods, matrix solutions, organic solvents, acquisition methods and data analysis methods. After evaluation of the existing methods, a standard protocol was developed to generate MALDI-TOF mass spectra from a collection of reference strains belonging to the genera Leuconostoc, Fructobacillus and Lactococcus. Bacterial cells were harvested after 24 h of growth at 28°C on MRS or TSA medium. Mass spectra were generated using the CHCA matrix combined with a 50:48:2 acetonitrile:water:trifluoroacetic acid matrix solution, and analyzed by the cell smear method and the cell extract method. After a data preprocessing step, the resulting high-quality data set was used for principal component analysis (PCA), distance calculation and multi-dimensional scaling. These analyses demonstrated that the MALDI-TOF mass spectra contain species-specific information. As a next step, the spectra, as well as the binary character set derived from these spectra, were successfully used for species identification within the genera Leuconostoc, Fructobacillus and Lactococcus. Using MALDI-TOF MS identification libraries for Leuconostoc and Fructobacillus strains, 84% of the MALDI-TOF mass spectra were correctly identified at the species level. The same analysis strategy within the genus Lactococcus resulted in 94% correct identifications, taking species and subspecies levels into consideration. Finally, two machine learning techniques, support vector machines and random forests, were evaluated as alternative species identification tools. They yielded accuracies between 94% and 98% for the identification of Leuconostoc and Fructobacillus species.
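As an illustration of how a binary character set and a distance calculation can support library-based identification, the following minimal Python sketch may help. It is not the procedure used in this study: the m/z bin width, the choice of the Jaccard distance, the nearest-match rule, the function names and the toy peak lists are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Tuple

BIN_WIDTH = 10.0  # assumed m/z bin width; not a value taken from the study


def binary_characters(peaks_mz: List[float], bin_width: float = BIN_WIDTH) -> FrozenSet[int]:
    """Reduce a peak list to the set of occupied m/z bins (presence/absence)."""
    return frozenset(int(mz // bin_width) for mz in peaks_mz)


def jaccard_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Return 1 - |A & B| / |A | B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def identify(query: List[float], library: Dict[str, List[List[float]]]) -> Tuple[str, float]:
    """Return the label of the closest library spectrum and its distance."""
    q = binary_characters(query)
    best_label, best_dist = "", float("inf")
    for label, spectra in library.items():
        for peaks in spectra:
            d = jaccard_distance(q, binary_characters(peaks))
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


# Invented toy peak lists (m/z values); these are not real spectra.
LIBRARY = {
    "species_A": [[3012.4, 4521.9, 6033.1, 7410.6]],
    "species_B": [[3250.7, 4521.2, 5890.3, 8102.5]],
}

label, dist = identify([3013.0, 4520.8, 6031.5, 7995.0], LIBRARY)
print(label, round(dist, 2))
```

In this toy case the query shares three of five occupied bins with the species_A entry and one of seven with the species_B entry, so species_A is returned as the closest match. A real library would hold several preprocessed spectra per strain, and the bin width would need to reflect the mass accuracy of the instrument.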
