Profiling Animal Toxicants by Automatically Mining Public Bioassay Data: A Big Data Approach for Computational Toxicology

In vitro bioassays have been developed and are currently being evaluated as potential alternatives to traditional animal toxicity models. Advances in high-throughput screening have already generated an enormous amount of publicly available bioassay data for a large collection of compounds. When a compound is tested in a collection of different bioassays, the combined results can be viewed as a unique bio-profile for that compound, recording the responses induced when the compound interacts with different cellular systems or biological targets. Profiling compounds of environmental or pharmaceutical interest with toxicity-relevant bioassay data is a promising way to study complex animal toxicity. In this study, we developed an automatic virtual profiling tool to evaluate potential animal toxicants. First, we automatically acquired all PubChem bioassay data for a set of 4,841 compounds with publicly available rat acute toxicity results. Next, we developed a scoring system to evaluate the relevance of each extracted bioassay to animal acute toxicity. Finally, the top-ranked bioassays were selected to profile the compounds of interest. The resulting response profiles proved useful for prioritizing untested compounds by their animal toxicity potential and for forming a candidate in vitro toxicity testing panel. The protocol developed in this study could be combined with structure-activity approaches and used to explore additional publicly available bioassay datasets for modeling a broader range of animal toxicities.
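
The three steps above (collect assay outcomes, score each assay against known toxicity, profile compounds with the top-ranked assays) can be illustrated with a minimal Python sketch. The abstract does not state the scoring formula, so the score used here, the fraction of an assay's active compounds that are known animal toxicants, is only an assumed stand-in. The function names, the `min_actives` cutoff, and the toy data are likewise hypothetical and are not taken from the study.

```python
from collections import defaultdict

def score_assays(outcomes, toxic, min_actives=2):
    """Score each assay by the fraction of its active compounds that are toxic.

    outcomes: iterable of (assay_id, compound_id, is_active) records.
    toxic: dict mapping compound_id -> True/False (known animal toxicity).
    The formula and the min_actives cutoff are assumptions for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # assay_id -> [toxic actives, all actives]
    for assay_id, compound_id, is_active in outcomes:
        if is_active and compound_id in toxic:
            counts[assay_id][1] += 1
            if toxic[compound_id]:
                counts[assay_id][0] += 1
    return {
        assay_id: toxic_actives / actives
        for assay_id, (toxic_actives, actives) in counts.items()
        if actives >= min_actives  # skip assays with too few actives to judge
    }

def top_assays(scores, k):
    """Return the k highest-scoring assay IDs (ties broken by assay ID)."""
    return sorted(scores, key=lambda a: (-scores[a], a))[:k]

def profile(compound_id, outcomes, selected):
    """Response profile over the selected assays: 1 active, -1 inactive, 0 untested."""
    results = {(a, c): active for a, c, active in outcomes}
    vector = []
    for assay_id in selected:
        if (assay_id, compound_id) not in results:
            vector.append(0)
        else:
            vector.append(1 if results[(assay_id, compound_id)] else -1)
    return vector

# Toy data: three hypothetical assays and three hypothetical compounds.
outcomes = [
    ("A1", "c1", True), ("A1", "c2", True), ("A1", "c3", False),
    ("A2", "c1", True), ("A2", "c3", True),
    ("A3", "c2", True),
]
toxic = {"c1": True, "c2": True, "c3": False}

scores = score_assays(outcomes, toxic)
selected = top_assays(scores, k=2)
print(selected, profile("c3", outcomes, selected))
```

In this toy example, assay A3 is dropped because it has only one active compound, and the remaining assays are ranked by how strongly their actives are enriched for toxicants. A real pipeline would need a score that also accounts for inactive results and for how sparsely each assay was tested.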
