The construction of protein-protein interaction network based on machine learning method

Protein-protein interactions (PPIs) are central to most biological processes. Although considerable effort has been devoted to developing methods for predicting PPIs and constructing protein interaction networks, the application of most existing methods is limited by scarce and incomplete information. In the present work, we integrate multiple databases containing protein information and apply them to PPI prediction and to the construction of the interaction network. The process is based on several protein databases and a learning algorithm, the support vector machine. The results suggest that the integrated databases could be used to explore networks for any newly discovered protein whose biological relationships are unknown. In addition, supplementary experimental information can be added to the integrated databases to improve construction of the interaction network.
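The workflow described above (merge annotations from several sources, classify protein pairs with a support vector machine, assemble the predicted pairs into a network) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the features or the SVM implementation, so the two toy sources (`GO_TERMS`, `LOCALIZATION`), the protein IDs, the labels, and the pair features are hypothetical, and a plain linear SVM trained by hinge-loss subgradient descent stands in for whatever SVM the authors used.

```python
from itertools import combinations

# Toy annotation sources keyed by protein ID (hypothetical, not real records).
GO_TERMS = {
    "P1": {"GO:1", "GO:2", "GO:3"},
    "P2": {"GO:1", "GO:2"},
    "P3": {"GO:7"},
    "P4": {"GO:7", "GO:8"},
    "P5": {"GO:2", "GO:3"},  # "new" protein with no known partners
}
LOCALIZATION = {
    "P1": "nucleus", "P2": "nucleus", "P3": "membrane",
    "P4": "membrane", "P5": "nucleus",
}

# Labeled pairs: +1 = interacting, -1 = non-interacting (hypothetical labels).
TRAINING_PAIRS = [
    (("P1", "P2"), 1), (("P3", "P4"), 1),
    (("P1", "P3"), -1), (("P2", "P4"), -1),
]


def jaccard(a, b):
    """Jaccard similarity of two annotation sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(p, q):
    """Integrate both sources into one feature vector for a protein pair."""
    same_loc = 1.0 if LOCALIZATION[p] == LOCALIZATION[q] else 0.0
    return [jaccard(GO_TERMS[p], GO_TERMS[q]), same_loc]


def decision(w, b, x):
    """Signed score of the linear classifier; positive means 'interacts'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM fitted by subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:
                # Margin violated: regularization step plus hinge-loss step.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: regularization step only.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def build_network(proteins, w, b):
    """Adjacency sets containing every pair the classifier scores as positive."""
    network = {p: set() for p in proteins}
    for p, q in combinations(proteins, 2):
        if decision(w, b, pair_features(p, q)) > 0:
            network[p].add(q)
            network[q].add(p)
    return network


X = [pair_features(p, q) for (p, q), _ in TRAINING_PAIRS]
y = [label for _, label in TRAINING_PAIRS]
w, b = train_linear_svm(X, y)
network = build_network(sorted(GO_TERMS), w, b)
print(network["P5"])  # predicted partners of the unlabeled protein
```

In this sketch, adding a further data source amounts to appending one more value to the list returned by `pair_features` and retraining, which mirrors the abstract's point that supplementary information can be folded into the integrated databases.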
