Performance of combined fragmentation and retention prediction for the identification of organic micropollutants by LC-HRMS

Abstract

In nontarget screening, structure elucidation of small molecules from high resolution mass spectrometry (HRMS) data is challenging, particularly the selection of the most likely candidate structure among the many retrieved from compound databases. Several fragmentation and retention prediction methods have been developed to improve this candidate selection. To evaluate their performance, we compared two in silico fragmenters (MetFrag and CFM-ID) and two retention time prediction models (based on the chromatographic hydrophobicity index (CHI) and on log D). A set of 78 known organic micropollutants was analyzed by liquid chromatography coupled to an LTQ Orbitrap HRMS with electrospray ionization (ESI) in positive and negative mode using two fragmentation techniques with different collision energies. Both fragmenters performed well for most compounds, on average ranking the correct candidate structure within the top 25% in ESI+ mode and within the top 22 to 37% in ESI− mode. The rank of the correct candidate structure improved slightly when MetFrag and CFM-ID were combined. For unknown compounds detected in both ESI+ and ESI−, positive mode mass spectra were generally better suited to further structure elucidation. Both retention prediction models performed reasonably well for more hydrophobic compounds but not for early eluting hydrophilic substances. The log D prediction was more accurate than the CHI model. Although the two fragmentation prediction methods are more diagnostic and sensitive for candidate selection, including retention prediction by calculating a consensus score with optimized weighting can improve the ranking of correct candidates compared to the individual methods.

Graphical abstract: Consensus workflow for combining fragmentation and retention prediction in LC-HRMS-based micropollutant identification
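The consensus scoring mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the individual scores are scaled or which weights were found to be optimal, so everything below is an assumption made for illustration: each method's scores are scaled to the best candidate, the consensus is a weighted sum, and the candidate names, score values, and weights are hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]


def normalize(scores: Scores) -> Scores:
    """Scale scores so the best candidate gets 1.0 (assumed scaling)."""
    top = max(scores.values())
    if top <= 0:
        return {name: 0.0 for name in scores}
    return {name: value / top for name, value in scores.items()}


def consensus_rank(
    score_sets: Dict[str, Scores], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Rank candidates by a weighted sum of normalized per-method scores."""
    normalized = {method: normalize(s) for method, s in score_sets.items()}
    candidates = next(iter(score_sets.values())).keys()
    combined = {
        cand: sum(weights[m] * normalized[m].get(cand, 0.0) for m in weights)
        for cand in candidates
    }
    # Highest consensus score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for three candidate structures (not data from the study).
score_sets = {
    "metfrag": {"A": 0.9, "B": 1.0, "C": 0.2},
    "cfm_id": {"A": 0.8, "B": 0.7, "C": 0.1},
    "retention": {"A": 1.0, "B": 0.2, "C": 0.9},
}
# Hypothetical weights; the study optimized its weighting, values not given here.
weights = {"metfrag": 0.4, "cfm_id": 0.4, "retention": 0.2}

ranked = consensus_rank(score_sets, weights)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy example, candidate B has the best MetFrag score, but candidate A ranks first overall because it scores well with all three methods, which is the kind of effect a consensus score is meant to capture.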
