Social Network Fusion and Mining: A Survey

From a global perspective, the landscape of online social networks is highly fragmented. A large number of online social networks have appeared, providing users with various types of services. The information available in these networks generally falls into diverse categories and can be formally represented as heterogeneous social networks (HSNs). Meanwhile, in the age of online social media, users usually participate in multiple online social networks simultaneously to enjoy more services, and these users act as bridges connecting the different networks. Multiple HSNs therefore not only represent the information in each single network but also fuse information across networks. Formally, online social networks that share common users are called aligned social networks, and the shared users, who act as anchors aligning the networks, are called anchor users. The heterogeneous information generated by users' social activities in multiple aligned social networks gives practitioners and researchers the opportunity to study an individual user's social behavior across multiple social platforms simultaneously. This paper presents a comprehensive survey of the latest research on multiple aligned HSNs under the broad learning setting, covering five major research tasks: network alignment, link prediction, community detection, information diffusion, and network embedding.
