A hierarchical anatomical classification schema for prediction of phenotypic side effects

Prediction of adverse drug reactions is an important problem in drug discovery that can be addressed with data-driven strategies. SIDER is one of the most reliable and frequently used datasets for identifying key features and for building machine learning models that predict side effects. The inherently unbalanced nature of this data makes side effect prediction a difficult multi-label, multi-class problem. We highlight this intrinsic issue with SIDER data and the methodological flaws in relying on performance measures such as AUC when predicting side effects. We argue for the use of metrics that are robust to class imbalance when evaluating classifiers. Importantly, we present a ‘hierarchical anatomical classification schema’ which aggregates side effects into organs, sub-systems, and systems. Using a weighted performance measure and 5-fold cross-validation, we show that this strategy facilitates biologically meaningful side effect prediction at different levels of the anatomical hierarchy. By implementing various machine learning classifiers, we show that the Random Forest model yields the best classification accuracy at each level of coarse-graining. The manually curated hierarchical schema for side effects can also serve as the basis for future studies on the prediction of adverse reactions and the identification of key features linked to specific organ systems. Our study provides a strategy for hierarchical classification of side effects rooted in anatomy and can pave the way for calibrated expert systems for multi-level prediction of side effects.
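The aggregation idea can be illustrated with a minimal sketch. The mapping `HIERARCHY`, the toy drugs, and the helper names below are hypothetical and are not the curated schema from the study. The abstract does not specify the weighted performance measure, so a support-weighted F1 score is assumed here purely for illustration.

```python
# Hypothetical mapping: side effect -> (organ, system). Illustrative only,
# not the manually curated schema described in the abstract.
HIERARCHY = {
    "nausea": ("stomach", "digestive"),
    "gastritis": ("stomach", "digestive"),
    "hepatitis": ("liver", "digestive"),
    "arrhythmia": ("heart", "cardiovascular"),
}


def aggregate(drug_effects, level):
    """Coarse-grain each drug's side effects to `level` (0 = organ, 1 = system)."""
    return {
        drug: {HIERARCHY[effect][level] for effect in effects}
        for drug, effects in drug_effects.items()
    }


def f1_for_label(true_sets, pred_sets, label):
    """F1 for one label, computed as 2TP / (2TP + FP + FN)."""
    tp = fp = fn = 0
    for drug, true_labels in true_sets.items():
        in_true = label in true_labels
        in_pred = label in pred_sets.get(drug, set())
        tp += in_true and in_pred
        fp += (not in_true) and in_pred
        fn += in_true and (not in_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(true_sets, pred_sets):
    """Average per-label F1, weighted by how many drugs truly carry each label.

    Assumption: this stands in for the unspecified weighted measure.
    """
    labels = sorted(set().union(*true_sets.values()))
    support = {l: sum(l in s for s in true_sets.values()) for l in labels}
    total = sum(support.values())
    return sum(support[l] * f1_for_label(true_sets, pred_sets, l) for l in labels) / total


# Toy data: four made-up drugs with their side effects.
drug_effects = {
    "drugA": {"nausea", "arrhythmia"},
    "drugB": {"gastritis"},
    "drugC": {"hepatitis"},
    "drugD": {"arrhythmia"},
}

organ_labels = aggregate(drug_effects, level=0)
system_labels = aggregate(drug_effects, level=1)

# A trivial classifier that always predicts the majority system.
majority_pred = {drug: {"digestive"} for drug in drug_effects}
score = weighted_f1(system_labels, majority_pred)
```

In this toy case the majority-class predictor never recovers the cardiovascular label, and the support-weighted score reflects that failure, which is the kind of behaviour an imbalance-robust measure is meant to expose.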
