An Overview of Machine Learning and Big Data for Drug Toxicity Evaluation.

Drug toxicity evaluation is an essential part of drug development, as toxicity is reportedly responsible for the attrition of approximately 30% of drug candidates. The rapid increase in the number and types of large toxicology datasets, together with advances in computational methods, may be used to improve many steps in drug safety evaluation. The development of in silico models to screen for and understand mechanisms of drug toxicity may be particularly beneficial in the early stages of drug development, where toxicity assessment can most reduce expenses and labor time. To this end, machine learning methods have been employed to evaluate drug toxicity, but they are often limited by small and insufficiently diverse datasets. Recent advances in machine learning methods, together with the rapid growth of big toxicity data such as molecular descriptors, toxicogenomics, and high-throughput bioactivity data, may help alleviate some of these challenges. In this article, the most common machine learning methods used in toxicity assessment are reviewed, together with examples of toxicity studies that have applied them. Furthermore, a comprehensive overview of the different types of toxicity tools and datasets available for building in silico toxicity prediction models is provided to describe the current big toxicity data landscape and to highlight the opportunities and challenges associated with it.