Network Alignment Using Graphlet Signature and High Order Proximity

The network alignment problem arises in the graph-based formulation of many problems in computer science and biology. The task is to identify the best one-to-one matching between the vertices of a pair of networks by considering local topology, vertex attributes, or both. Existing network alignment algorithms use a diverse set of methodologies, such as eigendecomposition of a similarity matrix, solving a quadratic assignment problem through subgradient optimization, or heuristic-based iterative greedy matching. However, these methods are either too slow or have poor matching performance. Some also require extensive external node attributes as prior information for node matching. In this paper, we develop a novel topology-based network alignment approach, which we call GraphletAlign. The proposed method uses graphlet signatures as node attributes and applies a bipartite matching algorithm to obtain an initial alignment, which is then refined by considering higher-order matching. Our results on six large real-life networks show the superiority of GraphletAlign over existing network alignment methods, with accuracy improvements of up to 20%–72%.
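The first stage described above (graphlet signatures as node attributes, followed by bipartite matching) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the signature uses only the four orbits of 2- and 3-node graphlets, the cost is the squared Euclidean distance between signatures, the assignment is solved with a textbook Hungarian method, and the higher-order refinement step is omitted. The names `build_adj`, `signature`, `hungarian`, and `align` are hypothetical.

```python
from itertools import combinations


def build_adj(edges):
    """Undirected graph as {node: set(neighbors)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def signature(adj, u):
    """Orbit counts of u over 2- and 3-node graphlets (illustrative choice):
    (degree, end of induced 2-path, center of induced 2-path, triangles)."""
    nbrs = adj[u]
    degree = len(nbrs)
    path_end = sum(1 for v in nbrs for w in adj[v] if w != u and w not in nbrs)
    triangles = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
    path_center = degree * (degree - 1) // 2 - triangles
    return (degree, path_end, path_center, triangles)


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.
    Returns {row_index: column_index}."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0] * (n + 1), [0] * (m + 1)   # row and column potentials
    p, way = [0] * (m + 1), [0] * (m + 1)  # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return {p[j] - 1: j - 1 for j in range(1, m + 1) if p[j] != 0}


def align(adj1, adj2):
    """Initial alignment only: match nodes of adj1 to nodes of adj2 by
    signature distance. Assumes adj1 has no more nodes than adj2."""
    rows, cols = sorted(adj1), sorted(adj2)
    s1 = [signature(adj1, a) for a in rows]
    s2 = [signature(adj2, b) for b in cols]
    cost = [[sum((x - y) ** 2 for x, y in zip(a, b)) for b in s2] for a in s1]
    return {rows[i]: cols[j] for i, j in hungarian(cost).items()}


if __name__ == "__main__":
    # A triangle with a two-edge tail, and a relabeled copy of it.
    edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
    relabel = {0: "c", 1: "e", 2: "a", 3: "d", 4: "b"}
    g1 = build_adj(edges)
    g2 = build_adj([(relabel[a], relabel[b]) for a, b in edges])
    print(align(g1, g2))
```

In this toy pair, nodes 0 and 1 are structurally indistinguishable, so either may be matched to either of their counterparts. Signatures built from larger graphlets reduce such ties, and a refinement stage of the kind the abstract mentions would be the natural place to resolve those that remain.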
