Distributed computing strategies for processing of FT-ICR MS imaging datasets for continuous mode data visualization

Abstract

High-resolution Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry imaging enables the spatial mapping and identification of biomolecules from complex surfaces. FT-ICR requires long time-domain transients, so raw files are large and the resulting volume of raw data (“big data”) must be processed efficiently and rapidly. Large-area imaging and high spatial resolution imaging compound the problem. For FT-ICR, data processing and data reduction must not compromise the high mass resolution afforded by the mass spectrometer. The continuous mode “Mosaic Datacube” approach allows high mass resolution visualization (0.001 Da) of mass spectrometry imaging data, but requires more processing than feature-based approaches. We describe the use of distributed computing to process FT-ICR MS imaging datasets and generate continuous mode Mosaic Datacubes for high mass resolution visualization. An eight-fold improvement in processing time is demonstrated using a Dutch nationally available cloud service.
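The general idea of dividing an imaging dataset among workers and placing every spectrum on a fixed 0.001 Da grid can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the in-memory list of (m/z, intensity) pairs per pixel, and the use of local threads in place of cloud nodes are all assumptions made for illustration, and steps such as the Fourier transform of the transients and mass calibration are omitted.

```python
from concurrent.futures import ThreadPoolExecutor

BIN_WIDTH = 0.001  # Da, the visualization resolution quoted in the abstract


def split_pixels(n_pixels, n_workers):
    """Split pixel indices 0..n_pixels-1 into near-equal contiguous chunks."""
    base, extra = divmod(n_pixels, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        if size:
            chunks.append(range(start, start + size))
        start += size
    return chunks


def bin_index(mz, mz_min):
    """Map an m/z value to the nearest point of a fixed-width grid."""
    return int(round((mz - mz_min) / BIN_WIDTH))


def process_chunk(spectra, pixels, mz_min, n_bins):
    """Bin every spectrum in one chunk; returns {pixel: dense intensity row}."""
    out = {}
    for p in pixels:
        row = [0.0] * n_bins
        for mz, intensity in spectra[p]:
            i = bin_index(mz, mz_min)
            if 0 <= i < n_bins:  # ignore peaks outside the requested window
                row[i] += intensity
        out[p] = row
    return out


def build_datacube(spectra, mz_min, mz_max, n_workers=4):
    """Hypothetical driver: one task per pixel chunk, merged by pixel index."""
    n_bins = int(round((mz_max - mz_min) / BIN_WIDTH)) + 1
    chunks = split_pixels(len(spectra), n_workers)
    cube = [None] * len(spectra)
    # Threads stand in for remote workers here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(process_chunk, spectra, c, mz_min, n_bins)
            for c in chunks
        ]
        for f in futures:
            for p, row in f.result().items():
                cube[p] = row
    return cube


# Toy data: three pixels, each a list of (m/z, intensity) pairs.
spectra = [
    [(500.0004, 1.0), (500.002, 2.0)],
    [(500.001, 3.0)],
    [(500.0, 4.0), (500.0001, 1.0)],
]
cube = build_datacube(spectra, 500.0, 500.003, n_workers=2)
```

Because each pixel is processed independently, the chunks need no communication until the final merge, which is what makes this kind of workload a natural fit for distribution across machines.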
