Protein Interaction Prediction by Integrating Genomic Features and Protein Interaction Network Analysis

The recent explosion of genome-scale protein interaction screens has made it possible to study protein interactions at the level of the interactome and of networks. In this chapter, we begin with an introduction to a novel approach that probabilistically combines multiple information sources to predict protein interactions in yeast. Specifically, Section 5.2 describes the sources of genomic features. Section 5.3 provides a basic tutorial on machine-learning approaches and
