DrugECs: An Ensemble System with Feature Subspaces for Accurate Drug-Target Interaction Prediction

Background: Drug-target interaction is key in drug discovery, especially in the design of new lead compounds. However, finding a new lead compound for a specific target is complicated, laborious, and error-prone. Computational techniques are therefore commonly adopted in drug design, since they can save considerable time and cost.

Results: To address this issue, this work proposes a new prediction system for identifying drug-target interactions. First, drug-target pairs are encoded with a fragment technique and the software "PaDEL-Descriptor." The fragment technique encodes target proteins: it divides each protein sequence into several consecutive fragments and encodes each fragment with several physicochemical properties of amino acids. PaDEL-Descriptor creates the encoding vectors for drug molecules. Second, the dataset of drug-target pairs is resampled to obtain several overlapping subsets, each of which is input to a kNN (k-Nearest Neighbor) classifier to build an ensemble system.

Conclusion: Experimental results on the drug-target dataset showed that our method performs better and runs faster than the state-of-the-art predictors.
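The two steps described in the abstract (fragment encoding of proteins, then an ensemble of kNN classifiers trained on overlapping resampled subsets) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state which physicochemical properties, how many fragments, which value of k, or which combination rule are used, so the following are all assumptions made for illustration: the Kyte-Doolittle hydropathy scale as the single property, averaging within each fragment, random subsets drawn without replacement, majority voting across members, and the function names `encode_protein`, `encode_pair`, `build_ensemble`, and `ensemble_predict`. Drug descriptors are assumed to have been computed beforehand (for example with PaDEL-Descriptor) and are passed in as plain lists of numbers.

```python
import random
from collections import Counter

# Kyte-Doolittle hydropathy values, used here as ONE example property.
# (Assumption: the paper uses several properties; the abstract does not list them.)
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_protein(seq, n_fragments=4, scale=HYDROPATHY):
    """Split seq into consecutive fragments and average the property over each."""
    bounds = [round(i * len(seq) / n_fragments) for i in range(n_fragments + 1)]
    features = []
    for start, end in zip(bounds, bounds[1:]):
        values = [scale[aa] for aa in seq[start:end] if aa in scale]
        features.append(sum(values) / len(values) if values else 0.0)
    return features


def encode_pair(drug_descriptors, seq, n_fragments=4):
    """Concatenate precomputed drug descriptors with the protein encoding."""
    return list(drug_descriptors) + encode_protein(seq, n_fragments)


def knn_predict(train_X, train_y, x, k=3):
    """Plain kNN: majority label among the k nearest training vectors."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def build_ensemble(X, y, n_members=5, subset_fraction=0.8, seed=0):
    """Draw overlapping random subsets of the training pairs, one per member."""
    rng = random.Random(seed)
    size = max(1, int(subset_fraction * len(X)))
    members = []
    for _ in range(n_members):
        idx = rng.sample(range(len(X)), size)
        members.append(([X[i] for i in idx], [y[i] for i in idx]))
    return members


def ensemble_predict(members, x, k=3):
    """Combine member predictions by majority vote (an assumed fusion rule)."""
    votes = Counter(knn_predict(mx, my, x, k) for mx, my in members)
    return votes.most_common(1)[0][0]
```

In use, each drug-target pair would be turned into one vector with `encode_pair`, the labeled vectors passed to `build_ensemble`, and a new pair classified with `ensemble_predict`. An odd `k` and an odd `n_members` avoid tied votes for binary labels.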
