Features analysis for identification of date and party hubs in protein interaction network of Saccharomyces Cerevisiae

Background
Biological networks are understood to have a modular organization, which is a source of their observed complexity. Analyses of networks and motifs have shown that two types of hubs, party hubs and date hubs, are responsible for this complexity. Party hubs act as local coordinators because they are highly co-expressed with their partners, whereas date hubs show low co-expression and are regarded as global connectors. However, the literature does not agree on these concepts, and different studies report results on different data sets. We investigated whether there is a relation between the biological features of Saccharomyces cerevisiae proteins and their roles as non-hubs, intermediately connected proteins, party hubs, and date hubs. We propose a classifier that separates these four classes.

Results
We extracted a range of biological characteristics from various sources, including amino acid sequences, domain contents, repeated domains, functional categories, biological processes, cellular compartments, disordered regions, and position-specific scoring matrices. We examined several classifiers and selected the best feature sets based on the average correct classification rate and the correlation coefficients of the results. We show that fusing five feature sets (domains, Position Specific Scoring Matrix-400, cellular compartments level one, composition pairs with two gaps, and composition pairs with one gap) provides the best discrimination, with an average correct classification rate of 77%.

Conclusions
We studied a variety of known biological feature sets of proteins and show that there is a relation between five of them (domains, Position Specific Scoring Matrix-400, cellular compartments level one, and composition pairs with two gaps and with one gap) and the roles of Saccharomyces cerevisiae proteins in the protein interaction network as non-hubs, intermediately connected proteins, party hubs, and date hubs. This study also confirms that non-hubs, party hubs, and date hubs can be predicted from their biological features with acceptable accuracy. If this relation also holds in other species, similar methods can be applied to predict the roles of proteins in those species.
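
To make the "composition pairs with gaps" feature sets more concrete, the following Python sketch computes one common form of such a descriptor: the normalized frequency of each ordered amino acid pair separated by a fixed number of intervening residues. The abstract does not give the authors' exact definition, so the function name `gapped_pair_composition`, the 20-letter alphabet, and the normalization are illustrative assumptions and not the published method.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def gapped_pair_composition(sequence, gap):
    """Return 400 frequencies of ordered residue pairs separated by `gap` residues.

    Hypothetical helper for illustration only. gap=0 gives the ordinary
    dipeptide composition; gap=1 and gap=2 correspond to pairs with one
    and two intervening residues.
    """
    seq = sequence.upper()
    step = gap + 1  # distance between the two residues of a pair
    counts = Counter()
    for i in range(len(seq) - step):
        a, b = seq[i], seq[i + step]
        # Skip pairs containing non-standard symbols such as X or B.
        if a in AMINO_ACIDS and b in AMINO_ACIDS:
            counts[a + b] += 1
    total = sum(counts.values())
    # Fixed ordering of the 400 pairs so vectors are comparable across proteins.
    return [
        counts[a + b] / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


if __name__ == "__main__":
    features = gapped_pair_composition("MKVLAAGIVGLCAK", gap=1)
    print(len(features))
```

Under these assumptions, each protein yields a 400-dimensional vector per gap size, and vectors for different gap sizes could be concatenated with other feature sets before classification.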
