K-means Clustering: An Efficient Algorithm for Protein Complex Detection

Protein complexes are dense groups of proteins and nucleic acids within the molecular interaction network of the cell, and they perform significant biological functions. Several computational methods have been developed to detect protein complexes from protein–protein interaction (PPI) networks. However, existing algorithms often fail to predict complexes accurately and yield low performance values. In this research, the K-means algorithm is proposed for protein complex detection and compared with existing algorithms such as MCODE and SPICi. Protein interaction and gene expression benchmark datasets (Collins, DIP, Krogan, Krogan Extended, PPI-D1, PPI-D2, GSE12220, GSE12221, GSE12442, and GSE17716) are used to compare the performance of the existing and proposed algorithms. The experimental analysis indicates that the proposed K-means clustering algorithm outperforms the existing methods.
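To illustrate the general idea of applying K-means to a PPI network, the following is a minimal sketch and not the authors' implementation. It assumes each protein is represented by its adjacency row (with a self-loop), uses a tiny hypothetical six-protein network, and uses a deterministic farthest-point initialisation so the result is reproducible. The abstract specifies none of these choices.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100):
    """Plain Lloyd-style k-means; returns one cluster label per point."""
    # Assumption: deterministic farthest-point initialisation, chosen only
    # so that this sketch gives the same result on every run.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy PPI network: two triangles joined by the edge C-D.
proteins = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("A", "C"), ("B", "C"),
         ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]

# Feature vector per protein: its adjacency row, with a 1 on the diagonal
# (an assumed representation, not one taken from the paper).
index = {name: i for i, name in enumerate(proteins)}
n = len(proteins)
features = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for u, v in edges:
    features[index[u]][index[v]] = 1.0
    features[index[v]][index[u]] = 1.0

labels = kmeans(features, k=2)

# Group proteins by cluster label to obtain the predicted complexes.
complexes = {}
for name, lab in zip(proteins, labels):
    complexes.setdefault(lab, []).append(name)
print(complexes)
```

In this sketch the number of clusters `k` must be chosen in advance, which is a general property of K-means. How the paper selects it is not stated in the abstract.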
