A novel subgradient-based optimization algorithm for blockmodel functional module identification

Functional module identification in biological networks may provide new insights into the complex interactions among biomolecules for a better understanding of cellular functional organization. Most existing functional module identification methods optimize network modularity: they cluster a network into groups of nodes within which the number of edges is higher than expected. However, module identification based on this topological criterion alone may miss certain kinds of biologically meaningful modules, namely those whose nodes are sparsely connected to each other but have similar interaction patterns with the rest of the network. To uncover more such modules, we propose a novel, efficient convex programming algorithm, based on the subgradient method with heuristic path generation, that solves the problem in a recently proposed framework of blockmodel module identification. We have applied our implementation to large-scale protein-protein interaction (PPI) networks, including Saccharomyces cerevisiae and Homo sapiens PPI networks collected from the Database of Interacting Proteins (DIP) and the Human Protein Reference Database (HPRD). Our experimental results show that the algorithm achieves network clustering performance comparable to that of the more time-consuming simulated annealing (SA) optimization. Furthermore, preliminary results on identifying fine-grained functional modules in both networks, together with a comparison against the commonly adopted Markov Clustering (MCL) algorithm, demonstrate the potential of our algorithm to discover new types of modules in which proteins are sparsely connected yet significantly enriched for biological functions.
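The limitation of the modularity criterion described above can be illustrated with a toy calculation. The sketch below is not the algorithm proposed in the paper; it only evaluates the standard Newman modularity Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(c_i, c_j) on two small, hypothetical graphs. The function name `modularity` and both example graphs are assumptions made for illustration.

```python
def modularity(n, edges, labels):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    n      -- number of nodes, labelled 0..n-1
    edges  -- list of (u, v) pairs, each undirected edge listed once
    labels -- labels[i] is the module assigned to node i
    """
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = 1
        adj[v][u] = 1
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the number of edges
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed edge minus the expected number under the null model
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical graph 1: complete bipartite graph K_{3,3}.
# Nodes 0-2 have no edges among themselves but identical interaction
# patterns (each connects to all of nodes 3-5), and vice versa.
bipartite_edges = [(u, v) for u in range(3) for v in range(3, 6)]
q_bipartite = modularity(6, bipartite_edges, [0, 0, 0, 1, 1, 1])

# Hypothetical graph 2: two triangles joined by a single edge,
# i.e. two densely connected groups.
triangle_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
q_triangles = modularity(6, triangle_edges, [0, 0, 0, 1, 1, 1])

print(f"K_3,3 split by side:      Q = {q_bipartite:.3f}")
print(f"two triangles, one each:  Q = {q_triangles:.3f}")
```

Grouping the bipartite graph by side yields a negative modularity, so a modularity-maximizing method would not propose that partition even though the nodes in each group interact with the rest of the network in exactly the same way. The two-triangle partition, by contrast, scores positively.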
