Rcpi: R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions

In chemoinformatics and bioinformatics, a main computational challenge in predictive modeling is finding a suitable way to represent the molecules under investigation, such as small molecules, proteins and even complex interactions. To address this problem, we developed a freely available R/Bioconductor package, Compound-Protein Interaction with R (Rcpi), for representing drugs, proteins and more complex interactions, including protein-protein and compound-protein interactions. Rcpi can calculate a large number of structural and physicochemical features of proteins and peptides from amino acid sequences, molecular descriptors of small molecules from their topology, and protein-protein interaction and compound-protein interaction descriptors. In addition to these main functionalities, Rcpi provides a number of auxiliary utilities. With the descriptors calculated by this package, users can conveniently apply statistical machine learning methods in R to research questions in computational biology and drug discovery. AVAILABILITY AND IMPLEMENTATION: Rcpi is freely available from the Bioconductor site (http://bioconductor.org/packages/release/bioc/html/Rcpi.html).
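As a rough illustration of what a sequence-derived descriptor and a compound-protein interaction descriptor can look like, the following Python sketch computes an amino acid composition vector and concatenates it with a compound feature vector. This is not Rcpi's code or API: the function names `amino_acid_composition` and `pair_descriptor`, the toy sequence and the toy compound vector are all hypothetical, and concatenation is assumed here as just one simple way to combine the two representations.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# protein maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    seq = sequence.upper()
    if not seq or any(ch not in AMINO_ACIDS for ch in seq):
        raise ValueError("sequence must be non-empty and use only the 20 standard amino acids")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def pair_descriptor(compound_features, protein_features):
    """Combine compound and protein features by simple concatenation (an assumed scheme)."""
    return list(compound_features) + list(protein_features)


# Toy inputs, made up for illustration only.
aac = amino_acid_composition("MKTAYIAKQR")
toy_compound = [1.0, 0.0, 1.0]  # hypothetical compound descriptor values
pair = pair_descriptor(toy_compound, aac)
```

The resulting fixed-length vector `pair` is the kind of input that standard statistical learning methods expect, with one row per compound-protein pair.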
