Unsupervised Network Alignment

Identifying the users shared by different online social sites is very hard, even for humans. Manually labeling anchor links is challenging, tedious, and expensive in human effort, time, and money, and the scale of real-world online social networks, which involve millions or even billions of users, makes labeling training data even more difficult. In this chapter, we introduce several approaches that address the network alignment problem in the unsupervised learning setting instead, where no labeled training data is needed to build the model.
