Automated analysis of electrospray ionization Fourier transform ion cyclotron resonance mass spectra of natural organic matter.

The advent of ultra-high-resolution mass spectrometry has revolutionized the ability of aquatic biogeochemists to examine molecular-level components of complex mixtures of organic matter. The ability to accurately assess the chemical composition, elemental formulas, or both of detected compounds is critical to these studies. Here we build on previous work that uses functional group relationships between compounds to extend elemental formulas of low molecular weight compounds to those of higher molecular weight. We propose an automated compound identification algorithm (CIA) for the analysis of ultra-high-resolution mass spectra of natural organic matter acquired by electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry. This approach is benchmarked with synthetic data sets of compounds cited in the literature. The sensitivity of our results is examined for different sources of error, and CIA is applied to two previously published data sets. We find that CIA works well for data sets with high mass accuracy (<1 ppm) and can accurately determine the elemental formulas for >95% of all compounds composed of C, H, O, and N. Data with lower mass accuracy must be accompanied by additional knowledge of chemical structure, composition, or both in order to yield accurate elemental formulas.
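The idea of extending formulas through functional group relationships can be illustrated with a minimal Python sketch. It is not the published CIA implementation: the function names, the building blocks (CH2, H2, O), the use of neutral monoisotopic masses, and the 1 ppm default tolerance are all assumptions chosen for illustration. Starting from peaks whose formulas are already known, each unassigned peak is given a formula if it lies one building block above an assigned peak within the mass tolerance.

```python
from collections import Counter

# Monoisotopic masses (u) of the most abundant isotopes of C, H, N, and O.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.99491462}

# Hypothetical building blocks; the actual set used by CIA is not given here.
BLOCKS = {
    "CH2": Counter(C=1, H=2),
    "H2": Counter(H=2),
    "O": Counter(O=1),
}


def formula_mass(formula):
    """Neutral monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def extend_formulas(seeds, peaks, tol_ppm=1.0):
    """Propagate formulas from known peaks to heavier peaks.

    seeds: {mass: formula} for peaks with known formulas (assumed input).
    peaks: iterable of neutral masses to assign.
    Returns {mass: Counter} for every peak that could be assigned.
    """
    assigned = {mass: Counter(formula) for mass, formula in seeds.items()}
    for peak in sorted(peaks):
        if peak in assigned:
            continue
        # Try every assigned peak plus one building block as a candidate.
        for known_formula in list(assigned.values()):
            for block in BLOCKS.values():
                candidate = known_formula + block
                if abs(ppm_error(peak, formula_mass(candidate))) <= tol_ppm:
                    assigned[peak] = candidate
                    break
            if peak in assigned:
                break
    return assigned
```

Because the sketch accepts the first candidate that falls within the tolerance, a looser tolerance makes it more likely that several candidates would fit a single peak. This matches the observation that data with lower mass accuracy need additional chemical knowledge to yield accurate formulas.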