Transfer learning for predicting human skin sensitizers

Computational prioritization of chemicals for potential skin sensitization risk plays an essential role in the risk assessment of environmental chemicals and in drug development. Given the huge number of chemicals awaiting testing, computational methods enable fast identification of high-risk chemicals for experimental validation and support the design of safer alternatives. However, developing a robust prediction model requires a large dataset of tested chemicals, which is usually not available for most toxicological endpoints, especially for human data. With a small training dataset, models tend to suffer from insufficient coverage and accuracy. In this study, an ensemble tree-based multitask learning method was developed that incorporates three relevant tasks from the well-defined adverse outcome pathway (AOP) of skin sensitization to transfer shared knowledge to the major task of predicting human sensitizers. The results show substantially improved coverage and accuracy compared with three state-of-the-art methods. A user-friendly prediction server is available at https://cwtung.kmu.edu.tw/skinsensdb/predict. As AOPs for various toxicity endpoints are being actively developed, the proposed method can be applied to develop prediction models for other endpoints.
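The abstract does not say how the auxiliary tasks are combined with the human data. One common way to set up tree-based multitask learning is to pool the samples from all tasks into a single table and append a task indicator, so that a tree ensemble can split on either chemical descriptors or task membership. The sketch below illustrates only that pooling step; the task names, the two-element descriptor vectors, and the `pool_tasks` helper are hypothetical placeholders, not the authors' implementation.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical task names: the source only states that three AOP-related
# tasks are combined with the major task of human sensitizers.
TASKS = ["human", "aop_task_1", "aop_task_2", "aop_task_3"]

# One sample = (chemical descriptor vector, binary sensitizer label).
Sample = Tuple[Sequence[float], int]


def pool_tasks(
    data: Dict[str, List[Sample]],
    tasks: Sequence[str] = TASKS,
) -> Tuple[List[List[float]], List[int]]:
    """Merge per-task datasets into one table with a one-hot task indicator.

    Each output row is the original descriptor vector followed by
    len(tasks) indicator columns marking which task the row came from.
    """
    X: List[List[float]] = []
    y: List[int] = []
    for idx, task in enumerate(tasks):
        for features, label in data.get(task, []):
            indicator = [1.0 if j == idx else 0.0 for j in range(len(tasks))]
            X.append(list(features) + indicator)
            y.append(label)
    return X, y


# Toy data (made up for illustration): few human samples, more auxiliary ones.
toy = {
    "human": [([0.1, 0.9], 1)],
    "aop_task_1": [([0.2, 0.8], 1), ([0.7, 0.1], 0)],
    "aop_task_2": [([0.6, 0.2], 0)],
}
X, y = pool_tasks(toy)
```

The pooled rows could then be passed to any tree ensemble, for example scikit-learn's `ExtraTreesClassifier`; predictions for the human endpoint would be made by setting the indicator columns of a query chemical to the `human` task.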
