Inferring domain-domain interactions from protein-protein interactions in the complex network conformation

Background
Protein domains are the functional and structural units of proteins, and a large proportion of protein-protein interactions (PPIs) are mediated by domain-domain interactions (DDIs). Since high-throughput technologies have produced large numbers of PPIs for different species, many computational efforts have been made to identify DDIs from experimental PPIs. These methods fall into two categories: deterministic and probabilistic. Deterministic methods rely on the parsimony assumption. The parsimony principle has been widely used in computational biology, because evolution in nature is regarded as a continuous optimization process. In the context of identifying DDIs, parsimony methods try to find a minimal set of DDIs that can explain the observed PPIs. Methods in this category are promising because they can be formulated and solved easily. Moreover, studies have shown that they can detect specific DDIs, which is often hard for probabilistic methods. However, existing methods treat a PPI network as a simple collection of individual interactions, whereas there is now ample evidence that PPI networks should be viewed from a global (systematic) perspective, because they exhibit general properties of complex networks such as 'scale-free' and 'small-world' behavior.

Results
In this work, we integrate this global point of view into the parsimony-based model. In particular, prior knowledge is extracted from these global properties by plausible reasoning and then taken as input to the model. We investigate the role of the added information extensively through numerical experiments. The results show that the proposed method has improved performance, which supports the biological meaning of the extracted prior knowledge.

Conclusions
This work provides some clues for using these properties of complex networks in computational models and, to some extent, reveals the biological meaning underlying these general network properties.
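
The parsimony idea described in the Background (find a small set of DDIs such that every observed PPI is explained by at least one of them) can be illustrated with a minimal sketch. This is not the model proposed in this work: it uses a greedy set-cover heuristic as a stand-in for an exact optimization, and it does not include the network-derived prior knowledge. The function names (`candidate_ddis`, `greedy_parsimony`) and the toy proteins and domains are hypothetical and chosen only for illustration.

```python
from itertools import product


def candidate_ddis(ppi, domains):
    """Return every domain pair that could mediate the given protein pair."""
    a, b = ppi
    # Sort each pair so that (A, B) and (B, A) count as the same DDI.
    return {tuple(sorted(pair)) for pair in product(domains[a], domains[b])}


def greedy_parsimony(ppis, domains):
    """Greedy set-cover heuristic (an assumption, not the authors' solver):
    repeatedly pick the DDI that explains the most still-unexplained PPIs."""
    cands = {ppi: candidate_ddis(ppi, domains) for ppi in ppis}
    # PPIs whose proteins have no annotated domains cannot be explained.
    uncovered = {ppi for ppi, c in cands.items() if c}
    chosen = set()
    while uncovered:
        counts = {}
        for ppi in uncovered:
            for ddi in cands[ppi]:
                counts[ddi] = counts.get(ddi, 0) + 1
        # Deterministic tie-break: highest count, then smallest pair.
        best = min(counts, key=lambda d: (-counts[d], d))
        chosen.add(best)
        uncovered = {ppi for ppi in uncovered if best not in cands[ppi]}
    return chosen


# Toy data (hypothetical): protein -> set of domains, and observed PPIs.
domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"}, "P4": {"B", "D"}}
ppis = [("P1", "P2"), ("P3", "P4"), ("P1", "P4")]
selected = greedy_parsimony(ppis, domains)
print(selected)
```

In this toy network the single domain pair A-B is a candidate for all three PPIs, so one DDI suffices to explain every observed interaction. A greedy heuristic does not guarantee a minimum set in general; an exact parsimony model would typically be solved as an integer or linear program.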