Cluster analysis of networks generated through homology: automatic identification of important protein communities involved in cancer metastasis

Background: Protein-protein interactions have traditionally been studied on a small scale, using classical biochemical methods to investigate the proteins of interest. More recently, large-scale methods such as two-hybrid screens have been used to survey extensive portions of genomes. Current high-throughput approaches have a relatively high error rate, whereas in-depth biochemical studies are too expensive and time-consuming to be practical for extensive studies. As a result, there are gaps in our knowledge of many key biological networks, and computational approaches are particularly well suited to filling them.

Results: We constructed networks, or 'interactomes', of putative protein-protein interactions in the rat proteome, the rat being an organism extensively used for cancer studies. This was achieved by integrating experimental protein-protein interaction data from many species and translating these data into the reference frame of the rat. The putative rat protein interactions were given confidence scores based on their homology to proteins that have been experimentally observed to interact. The confidence score was further weighted according to the extent of the experimental evidence, giving a higher weight to more frequently observed interactions. The scoring function was subsequently validated, and networks were constructed around key proteins identified as being highly up- or down-regulated in rat cell lines of high metastatic potential. Using clustering methods on the networks, we identified key protein communities involved in cancer metastasis.

Conclusion: The protein network generation and subsequent network analysis used here were shown to be useful for highlighting key proteins involved in metastasis. This approach, in conjunction with microarray expression data, can be extended to other species, thereby suggesting possible pathways around proteins of interest.
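The homology-based scoring described in the Results can be illustrated with a minimal Python sketch. The abstract does not give the scoring function itself, so everything here is an assumption for illustration: the function name `score_putative_interactions`, the use of a similarity value in [0, 1] for each homolog, the product of the two similarities as the per-observation confidence, and the noisy-OR rule used to give more weight to interactions observed more than once.

```python
from collections import defaultdict
from itertools import product


def score_putative_interactions(observed, homologs):
    """Transfer observed interactions onto rat proteins and score them.

    observed: list of (protein_a, protein_b) pairs seen experimentally in any
              species; a repeated or homologous pair counts as extra evidence.
    homologs: dict mapping a source protein to a list of
              (rat_protein, similarity) tuples, similarity in [0, 1].
    Both inputs are hypothetical formats chosen for this sketch.
    """
    evidence = defaultdict(list)
    for a, b in observed:
        # Every rat homolog of a paired with every rat homolog of b
        # becomes a putative rat interaction.
        for (rat_a, sim_a), (rat_b, sim_b) in product(
            homologs.get(a, []), homologs.get(b, [])
        ):
            pair = tuple(sorted((rat_a, rat_b)))
            # Assumed per-observation confidence: product of similarities.
            evidence[pair].append(sim_a * sim_b)

    scores = {}
    for pair, confidences in evidence.items():
        # Assumed combination rule (noisy-OR): each additional observation
        # can only raise the combined score.
        p_no_support = 1.0
        for c in confidences:
            p_no_support *= 1.0 - c
        scores[pair] = 1.0 - p_no_support
    return scores


# Usage with made-up identifiers and similarity values.
observed = [("hA", "hB"), ("mA", "mB"), ("hA", "hC")]
homologs = {
    "hA": [("rA", 0.9)],
    "hB": [("rB", 0.8)],
    "mA": [("rA", 0.5)],
    "mB": [("rB", 0.5)],
    "hC": [("rC", 0.6)],
}
for pair, score in sorted(score_putative_interactions(observed, homologs).items()):
    print(pair, round(score, 3))
```

In this toy input the rA-rB pair is supported by observations in two species and so scores higher than rA-rC, which is supported by one. This reproduces the qualitative behaviour described in the Results, not the authors' actual weighting.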
