Inferring protein-protein interactions using a hybrid genetic algorithm/support vector machine method.

Identifying protein-protein interactions is crucial for understanding biological systems and processes, as well as for mutant design. This paper proposes a novel hybrid Genetic Algorithm/Support Vector Machine (GA/SVM) method that predicts interactions between proteins through protein-domain relations. A protein domain is a structural and/or functional unit of a protein. Every protein can be characterized by a distinct domain or a sequential combination of multiple domains. In our method, each protein is first represented by its domains, with the effects of domain duplication also taken into account. A genetic algorithm (GA) then transforms the domain composition to simulate the combination of different domains. The optimal transformation is selected using a predictor built with a support vector machine (SVM). Compared with a random predictor, our method is more effective and efficient, achieving 0.85 sensitivity, 0.90 specificity, and 0.88 accuracy.
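To make the domain-based representation and the reported metrics concrete, the following is a minimal Python sketch and not the implementation used in the paper. It assumes one plausible encoding: each protein becomes a vector of per-domain counts (so duplicated domains are retained), and a protein pair is the concatenation of the two vectors. The function names and the domain identifiers "D1", "D2", "D3" are hypothetical placeholders, and the GA transformation and SVM training steps are omitted.

```python
from collections import Counter


def domain_count_vector(domains, vocabulary):
    """Encode one protein as per-domain counts (assumed encoding).

    Counting occurrences rather than recording presence/absence keeps
    information about duplicated domains.
    """
    counts = Counter(domains)
    return [counts[d] for d in vocabulary]


def pair_features(domains_a, domains_b, vocabulary):
    """Concatenate the count vectors of two proteins into one feature vector."""
    return (domain_count_vector(domains_a, vocabulary)
            + domain_count_vector(domains_b, vocabulary))


def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for 0/1 labels.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # fraction of true interactions recovered
    specificity = tn / (tn + fp)   # fraction of non-interactions rejected
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical domain vocabulary and proteins, for illustration only.
vocabulary = ["D1", "D2", "D3"]
protein_a = ["D1", "D2", "D1"]   # D1 is duplicated
protein_b = ["D3"]
features = pair_features(protein_a, protein_b, vocabulary)
```

Under these assumptions, `features` would be the input row handed to a classifier for the pair (protein_a, protein_b), and `binary_metrics` computes sensitivity, specificity, and accuracy from the predicted and true interaction labels.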
