Combining functional and topological properties to identify core modules in protein interaction networks

Advances in large-scale proteomics technologies, such as yeast two-hybrid screening and mass spectrometry, have made it possible to generate large Protein Interaction Networks (PINs). Recent methods for identifying dense sub-graphs in such networks have been based solely on graph-theoretic properties. There is therefore a need for an approach that combines domain-specific knowledge with topological properties to generate functionally relevant sub-graphs from large networks. This article describes two alternative network measures for the analysis of PINs that combine functional information with the topological properties of the networks. These measures, called weighted clustering coefficient and weighted average nearest-neighbors degree, use weights that represent the strength of the interaction between two proteins. The weights are calculated from the semantic similarity of the proteins, which is based on their Gene Ontology terms. We perform a global analysis of the yeast PIN by systematically comparing the weighted measures with their topological counterparts. To show the usefulness of the weighted measures, we develop an algorithm for the identification of functional modules, called SWEMODE (Semantic WEights for MODule Elucidation), that identifies dense sub-graphs containing functionally similar proteins. The proposed method ranks nodes, i.e., proteins, according to their weighted neighborhood cohesiveness. The highest ranked nodes are considered as seeds for candidate modules. The algorithm then iterates through the neighborhood of each seed protein to identify densely connected proteins with high functional similarity, according to the chosen parameters. Using a yeast two-hybrid data set of experimentally determined protein–protein interactions, we demonstrate that SWEMODE is able to identify dense clusters containing proteins that are functionally similar. Many of the identified modules correspond to known complexes or subunits of these complexes. Proteins 2006. © 2006 Wiley-Liss, Inc.
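
The two weighted measures and the seed ranking described above can be illustrated with a minimal Python sketch. The abstract does not state the formulas, so the code assumes the commonly used definitions for weighted networks (as in Barrat et al., "The architecture of complex weighted networks"): the weighted clustering coefficient of node i is the sum of (w_ij + w_ih)/2 over ordered pairs of connected neighbors j, h, divided by s_i(k_i - 1), and the weighted average nearest-neighbors degree is the sum of w_ij k_j over neighbors j, divided by s_i, where k_i is the degree and s_i the total edge weight of node i. All function names, node labels, and edge weights are hypothetical. The weights stand in for Gene Ontology based semantic similarity, and ranking by weighted clustering is only a stand-in for the cohesiveness ranking that SWEMODE uses, which may differ.

```python
from itertools import combinations


def build_graph(weighted_edges):
    """Return an adjacency map {node: {neighbor: weight}} for an undirected graph."""
    graph = {}
    for u, v, w in weighted_edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def weighted_clustering(graph, node):
    """Weighted clustering coefficient (assumed Barrat-style definition)."""
    nbrs = graph[node]
    k = len(nbrs)                 # degree k_i
    s = sum(nbrs.values())        # strength s_i
    if k < 2 or s == 0:
        return 0.0
    total = 0.0
    for j, h in combinations(nbrs, 2):
        if h in graph[j]:         # j and h close a triangle with node
            # Each unordered pair occurs twice in the ordered sum,
            # so 2 * (w_ij + w_ih) / 2 = w_ij + w_ih.
            total += nbrs[j] + nbrs[h]
    return total / (s * (k - 1))


def weighted_nn_degree(graph, node):
    """Weighted average nearest-neighbors degree (assumed Barrat-style definition)."""
    nbrs = graph[node]
    s = sum(nbrs.values())
    if s == 0:
        return 0.0
    return sum(w * len(graph[j]) for j, w in nbrs.items()) / s


def rank_seeds(graph):
    """Order nodes by decreasing weighted clustering; ties broken by label."""
    return sorted(graph, key=lambda n: (-weighted_clustering(graph, n), n))


# Toy network; the weights are made-up placeholders for semantic similarity.
edges = [("A", "B", 1.0), ("A", "C", 1.0), ("B", "C", 1.0), ("A", "D", 0.5)]
graph = build_graph(edges)

for protein in rank_seeds(graph):
    print(protein,
          round(weighted_clustering(graph, protein), 3),
          round(weighted_nn_degree(graph, protein), 3))
```

In this toy network, the weakly weighted edge A–D lowers the strength of A less than an unweighted edge would, so the weighted clustering coefficient of A comes out higher than its purely topological value of 1/3. This is the kind of difference between weighted and topological measures that the abstract refers to.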
