Towards Compound Identification of Synthetic Opioids in Non-targeted Screening Using Machine Learning Techniques

The constant evolution of the illicit drug market makes the identification of unknown compounds problematic. Obtaining certified reference materials for a broad array of new analogues can be difficult and cost-prohibitive. Machine learning provides a promising avenue for putatively identifying a compound before confirmation against a standard. In this study, machine learning approaches were used to develop a class prediction model and a retention time prediction model. The class prediction model used a Naïve Bayes classifier to assign opioids to one of three classes (fentanyl analogues, AH series or U series) with an accuracy of 89.5%. The model was most accurate for the fentanyl analogues, most likely because they were the most numerous class in the training data. This classification model can guide an analyst in determining a suspected structure. A retention time prediction model was also trained for a wide array of synthetic opioids. This model used Gaussian Process Regression to predict the retention time of analytes from multiple generated molecular features, with 79.7% of the samples predicted within ± 0.1 min of their experimental retention time. Once the suspected structure of an unknown compound is determined, molecular features can be generated and input to the prediction model, and the predicted retention time compared with the experimental value. Incorporating machine learning prediction models into a compound identification workflow can support putative identifications with greater confidence and ultimately save time and money otherwise spent on purchasing or producing superfluous certified reference materials.
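The retention time comparison described above can be illustrated with a minimal sketch. It assumes that predicted retention times have already been produced by some regression model, and it only shows the tolerance check and the "fraction within ± 0.1 min" summary statistic. The function names and the numeric values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def within_tolerance(predicted: float, experimental: float, tol: float = 0.1) -> bool:
    """Return True if the predicted retention time (min) lies within
    +/- tol of the experimental retention time (min)."""
    return abs(predicted - experimental) <= tol


def fraction_within(predicted: Sequence[float],
                    experimental: Sequence[float],
                    tol: float = 0.1) -> float:
    """Fraction of samples whose predicted retention time falls within
    +/- tol min of the experimental value."""
    if len(predicted) != len(experimental):
        raise ValueError("predicted and experimental must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")
    hits = sum(within_tolerance(p, e, tol) for p, e in zip(predicted, experimental))
    return hits / len(predicted)


# Hypothetical retention times in minutes, for illustration only.
predicted_rt = [4.52, 5.10, 6.33, 7.80]
experimental_rt = [4.50, 5.25, 6.30, 7.95]

print(f"{fraction_within(predicted_rt, experimental_rt):.1%} within tolerance")
```

In a workflow of the kind the abstract describes, a candidate structure whose predicted retention time fails this check would be a weaker putative identification than one that passes it. The tolerance itself is a parameter an analyst would choose.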
