Adaptive Neural Network-Based Clustering of Yeast Protein-Protein Interactions

In this paper, we present an adaptive neural network based clustering method that groups protein-protein interaction data by functional category, combined with information theory based feature selection, for predicting new protein interactions. Our technique for grouping protein interactions is based on the ART-1 neural network. The cluster prototypes constructed from existing protein interaction data are used to predict the class of new protein interactions. The protein interaction data of S. cerevisiae (baker's yeast) from MIPS and SGD are used. Clustering performance was compared with the traditional k-means clustering method in terms of cluster distance. According to the experimental results, the proposed method achieves about 89.7% clustering accuracy, and the feature selection filter improved overall performance by about 14.8%. In addition, the inter-cluster distances of the clusters constructed with the ART-1 based method indicate high cluster quality.
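To make the clustering and prediction steps concrete, the following is a minimal sketch of a simplified, single-pass ART-1 style procedure on binary feature vectors. It is an illustration under stated assumptions, not the implementation used in the paper: the function names `art1_cluster` and `predict`, the `vigilance` and `beta` parameters and their default values, and the binary encoding of interactions are all hypothetical choices made here.

```python
import numpy as np


def _ranked(x, prototypes, beta):
    """Order prototype indices by the choice value |x AND w| / (beta + |w|)."""
    scores = [np.logical_and(x, w).sum() / (beta + w.sum()) for w in prototypes]
    return sorted(range(len(prototypes)), key=lambda j: -scores[j])


def art1_cluster(patterns, vigilance=0.7, beta=1.0):
    """Simplified single-pass ART-1 clustering of binary row vectors.

    Assumption: one presentation of each pattern, fast learning
    (prototype := prototype AND input). Returns (labels, prototypes).
    """
    patterns = np.asarray(patterns, dtype=bool)
    prototypes, labels = [], []
    for x in patterns:
        norm_x = x.sum()
        chosen = None
        for j in _ranked(x, prototypes, beta):
            overlap = np.logical_and(x, prototypes[j]).sum()
            # Vigilance test: enough of the input must be covered by the prototype.
            if norm_x > 0 and overlap / norm_x >= vigilance:
                prototypes[j] = np.logical_and(x, prototypes[j])
                chosen = j
                break
        if chosen is None:
            # No prototype resonates, so the input founds a new cluster.
            prototypes.append(x.copy())
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes


def predict(x, prototypes, vigilance=0.7, beta=1.0):
    """Return the index of the first prototype passing the vigilance test, or None."""
    x = np.asarray(x, dtype=bool)
    norm_x = x.sum()
    for j in _ranked(x, prototypes, beta):
        overlap = np.logical_and(x, prototypes[j]).sum()
        if norm_x > 0 and overlap / norm_x >= vigilance:
            return j
    return None
```

In this sketch a higher `vigilance` produces more, tighter clusters, and `predict` leaves the learned prototypes unchanged, which mirrors the idea of classifying a new interaction against prototypes built from existing data.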
