FIS-PNN: A hybrid computational method for protein-protein interaction prediction

The study of protein-protein interactions (PPIs) is an active area of research in biology because these interactions mediate most biological functions in any organism. Although no definitive properties are known for predicting PPIs, extensive wet-lab experiments suggest, with high probability, that interacting proteins at a fine level share similar functions, cellular roles, and sub-cellular locations. In this study, we developed a technique to predict PPIs based on the proteins' secondary structures, co-localization, and function annotation. Our approach, FIS-PNN, predicts interacting proteins in yeast using hybrid machine learning algorithms. FIS-PNN was trained and tested using 1029 proteins with 2965 known positive interactions; it predicted PPIs with 96% accuracy, a level significantly greater than that of all other existing sequence-based prediction methods.
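The abstract does not expand the acronym or describe the architecture. As a rough illustration only, the sketch below assumes that "PNN" refers to a probabilistic neural network, that is, a Parzen-window classifier that averages Gaussian kernel responses per class and picks the class with the largest average. The function name `pnn_predict`, the smoothing parameter `sigma`, the three-value feature layout, and all feature values are hypothetical and are not taken from the study; the fuzzy (FIS) component is not shown.

```python
import math


def pnn_predict(train_x, train_y, x, sigma=0.5):
    """Classify x with a minimal probabilistic neural network.

    Each training sample acts as one pattern unit with a Gaussian kernel;
    the summation layer averages kernel responses per class, and the
    decision layer returns the class with the largest average.
    """
    sums = {}
    counts = {}
    for xi, yi in zip(train_x, train_y):
        # Squared Euclidean distance between the query and this pattern unit.
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    averages = {label: sums[label] / counts[label] for label in sums}
    return max(averages, key=averages.get)


# Hypothetical protein-pair features (illustrative values, not study data):
# (secondary-structure similarity, co-localization flag, function similarity)
train_x = [
    (0.9, 1.0, 0.8),
    (0.8, 1.0, 0.9),
    (0.2, 0.0, 0.1),
    (0.1, 0.0, 0.3),
]
train_y = [1, 1, 0, 0]  # 1 = interacting pair, 0 = non-interacting pair

label = pnn_predict(train_x, train_y, (0.85, 1.0, 0.7))
print("predicted label:", label)
```

In this kind of classifier, `sigma` controls how far each training pair's influence extends; a real system would tune it on held-out data rather than fix it by hand.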
