Dimensionality reduction of mass spectrometry imaging data using autoencoders

Mass spectrometry imaging (MSI) has become a powerful tool in biology, pharmacology and healthcare. Next-generation experimental techniques can generate hundreds of gigabytes of data from a single image acquisition, so analysing these data requires advanced algorithms. Analytical workflows currently begin by pre-processing the data to reduce their size. The pre-processed data nevertheless remain high-dimensional, and further dimensionality reduction is needed before they can be analysed. For hyperspectral data, the techniques in current use are mostly linear. Here we successfully apply an autoencoder to an MSI dataset with over 165,000 pixels and more than 7,000 spectral channels, reducing it to a few core features. Our unsupervised method provides the MSI community with an effective non-linear dimensionality reduction technique that includes the mapping both to and from the reduced-dimensional space. Compared with methods such as PCA, it removes the need to select meaningful features from the full list of components, which reduces the subjectivity and the amount of human interaction in the analysis.
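To make the idea of a non-linear mapping to and from a reduced space concrete, the following is a minimal sketch of a single-hidden-layer autoencoder in NumPy. It is not the architecture used in this work, which the abstract does not specify. The data are synthetic and tiny (400 pixels by 64 channels rather than 165,000 by 7,000), and all names (`encode`, `decode`, `train_step`, `n_features`), the tanh activation, the normalisation and the learning rate are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an MSI datacube flattened to (pixels, channels).
# Assumption: each spectrum is a mixture of 3 hidden components plus noise.
n_pixels, n_channels, n_features = 400, 64, 3
latent = rng.random((n_pixels, 3))
mixing = rng.random((3, n_channels))
spectra = latent @ mixing + 0.01 * rng.normal(size=(n_pixels, n_channels))

# Centre each channel and rescale so the mean squared spectrum norm is 1.
spectra = spectra - spectra.mean(axis=0)
spectra = spectra / np.sqrt((spectra ** 2).sum(axis=1).mean())

params = {
    "W_enc": rng.normal(0.0, 0.5, (n_channels, n_features)),
    "b_enc": np.zeros(n_features),
    "W_dec": rng.normal(0.0, 0.1, (n_features, n_channels)),
    "b_dec": np.zeros(n_channels),
}

def encode(p, x):
    """Map spectra to the reduced space (non-linear because of tanh)."""
    return np.tanh(x @ p["W_enc"] + p["b_enc"])

def decode(p, z):
    """Map reduced features back to the spectral space."""
    return z @ p["W_dec"] + p["b_dec"]

def train_step(p, x, lr):
    """One full-batch gradient-descent step on the reconstruction error."""
    z = encode(p, x)
    err = decode(p, z) - x
    loss = (err ** 2).sum(axis=1).mean()   # mean squared error per pixel
    g_out = 2.0 * err / x.shape[0]         # d(loss)/d(reconstruction)
    g_pre = (g_out @ p["W_dec"].T) * (1.0 - z ** 2)  # back through tanh
    grads = {
        "W_dec": z.T @ g_out,
        "b_dec": g_out.sum(axis=0),
        "W_enc": x.T @ g_pre,
        "b_enc": g_pre.sum(axis=0),
    }
    for name in p:
        p[name] -= lr * grads[name]
    return loss

losses = [train_step(params, spectra, lr=0.1) for _ in range(500)]

features = encode(params, spectra)         # (pixels, n_features)
reconstruction = decode(params, features)  # (pixels, n_channels)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Each column of `features` can be reshaped to the image dimensions and viewed as a feature map, while `decode` gives the corresponding path back to spectra. A dataset of the size described above would in practice call for mini-batch training and a deep-learning framework rather than this full-batch loop.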
