Iterative Cluster Analysis of Protein Interaction Data

MOTIVATION: Fast hierarchical clustering tools are needed for cases in which the distances among the elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data.

RESULTS: We present the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions of all the proteins in the group are considered. We show that this novel strategy has advantages over conventional clustering methods for exploring protein-protein interaction data. UVCLUSTER easily incorporates the information of the largest available interaction datasets to generate comprehensive primary distance tables. The versatility, simplicity of use and high speed of UVCLUSTER on standard personal computers suggest that it can be a benchmark analytical tool for interactome data analysis.

AVAILABILITY: The program is available upon request from the authors, free for academic users. Additional information is available at http://www.uv.es/genomica/UVCLUSTER.
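The primary distance defined in the abstract (the minimum number of interactions needed to connect two proteins) can be illustrated with a breadth-first search over an interaction graph. The sketch below is not UVCLUSTER's code: the function name `primary_distances`, the toy protein labels and the toy interaction list are all hypothetical, and the conversion to secondary distances is not attempted because the abstract does not specify it.

```python
from collections import Counter, deque


def primary_distances(interactions, proteins):
    """Return {(a, b): steps} for every ordered pair in `proteins`.

    Hypothetical helper: `steps` is the minimum number of interactions
    connecting a and b, or None if no path exists.
    """
    # Build an undirected adjacency map from the interaction pairs.
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    distances = {}
    for source in proteins:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        steps = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in neighbors.get(current, ()):
                if nxt not in steps:
                    steps[nxt] = steps[current] + 1
                    queue.append(nxt)
        for target in proteins:
            if target != source:
                distances[(source, target)] = steps.get(target)
    return distances


# Toy network with made-up protein labels (not real data).
toy_interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
toy_proteins = ["A", "B", "C", "D", "E"]
toy_table = primary_distances(toy_interactions, toy_proteins)

# Count how many unordered pairs share each distance value.
tie_counts = Counter(d for (a, b), d in toy_table.items() if a < b)
```

In this toy network the ten protein pairs take only three distinct distance values, which shows on a small scale why integer step counts produce the frequent ties mentioned in the motivation.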
