High Functional Coherence in k-Partite Protein Cliques of Protein Interaction Networks

We introduce a new topological concept, the k-partite protein clique, for studying protein-protein interaction (PPI) networks. In particular, we examine the functional coherence of proteins within k-partite protein cliques. A k-partite protein clique is a k-partite maximal clique comprising two or more nonoverlapping protein subsets, in which every protein in one subset interacts with every protein in each of the other subsets. To detect k-partite maximal cliques in a PPI network, we propose transforming the network into induced K-partite graphs with proteins as vertices, where edges exist only between the graph's partites. We then present a k-partite maximal clique mining (MaCMik) algorithm to enumerate k-partite maximal cliques from K-partite graphs, and apply it to a yeast PPI network. We observe unusually high functional coherence in k-partite protein cliques: most proteins in a k-partite protein clique, especially those in the same partite, share the same functions. The idea of k-partite protein cliques therefore suggests a novel approach to characterizing PPI networks, and may help function prediction for unknown proteins.
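To make the definition concrete, the following minimal Python sketch checks whether a given set of protein subsets satisfies the "full interactions between any two subsets" condition. It is an illustration of the definition only, not the MaCMik algorithm, and it does not test maximality. The function name `is_k_partite_clique`, the protein labels `P1` to `P5`, and the toy interaction list are all hypothetical and are not taken from the yeast data.

```python
from itertools import combinations


def is_k_partite_clique(partites, interactions):
    """Check the k-partite clique condition (maximality is NOT checked).

    partites     -- iterable of protein subsets (hypothetical input format)
    interactions -- iterable of (protein, protein) pairs, treated as undirected
    """
    # Store each interaction as an unordered pair.
    edges = {frozenset(pair) for pair in interactions}
    groups = [set(p) for p in partites]

    # The definition requires two or more non-empty subsets.
    if len(groups) < 2 or any(not g for g in groups):
        return False

    for a, b in combinations(groups, 2):
        # Subsets must be nonoverlapping.
        if a & b:
            return False
        # Every cross-subset pair of proteins must interact.
        for u in a:
            for v in b:
                if frozenset((u, v)) not in edges:
                    return False
    return True


# Toy interaction list (assumed for illustration only).
toy_edges = [
    ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P2", "P4"),
    ("P3", "P5"), ("P4", "P5"), ("P1", "P5"),
]

bipartite_ok = is_k_partite_clique([{"P1", "P2"}, {"P3", "P4"}], toy_edges)
tripartite_ok = is_k_partite_clique([{"P1"}, {"P3", "P4"}, {"P5"}], toy_edges)
# P2 and P5 do not interact, so this grouping fails the condition.
tripartite_bad = is_k_partite_clique(
    [{"P1", "P2"}, {"P3", "P4"}, {"P5"}], toy_edges
)
```

Interactions inside a single subset are ignored by this check, which mirrors the induced K-partite graphs described above, where edges exist only between partites.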
