Clustering social audiences in business information networks

Abstract Business information networks involve diverse users and rich content, and they have emerged as important platforms for business intelligence and business decision making. A key step in an organization's business intelligence process is to cluster users with similar interests into social audiences and to discover the roles they play within a business network. In this article, we propose a novel machine-learning approach, called CBIN, that co-clusters business information networks to discover and understand these audiences. The CBIN framework is based on co-factorization. Audience clusters are discovered from a combination of network structure and rich contextual information, such as node interactions and node-content correlations. Because audience clusters are defined by the data and often overlap, determining the number of clusters in advance is usually very difficult. We therefore base CBIN on an overlapping clustering paradigm with a hold-out strategy that discovers the optimal number of clusters from the underlying data. Experiments on 13 real-world enterprise datasets show that CBIN performs very well compared with other state-of-the-art algorithms.
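The abstract names three ingredients: co-factorization of structure and content, overlapping cluster membership, and a hold-out strategy for choosing the number of clusters. The sketch below illustrates how these ideas can fit together in a generic setting; it is not the CBIN algorithm, whose objective and update rules are not given here. It assumes a node-by-node link matrix `A` and a node-by-feature content matrix `X`, uses standard multiplicative updates for a joint nonnegative factorization with a shared node factor `U`, and scores each candidate number of clusters by reconstruction error on masked link entries. All function names, the weight `lam`, and the membership threshold are hypothetical choices made for illustration.

```python
import numpy as np

def cofactorize(A, X, k, mask=None, lam=1.0, iters=200, seed=0, eps=1e-9):
    """Jointly fit A ~ U V^T (links) and X ~ U W^T (content) with a shared U >= 0.

    Generic weighted NMF with multiplicative updates; an illustrative
    stand-in, not the CBIN objective.
    """
    A = np.asarray(A, dtype=float)
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    M = np.ones_like(A) if mask is None else np.asarray(mask, dtype=float)
    U = rng.random((n, k))
    V = rng.random((n, k))
    W = rng.random((d, k))
    MA = M * A  # only observed link entries contribute to the fit
    for _ in range(iters):
        R = M * (U @ V.T)
        U *= (MA @ V + lam * (X @ W)) / (R @ V + lam * (U @ (W.T @ W)) + eps)
        R = M * (U @ V.T)
        V *= (MA.T @ U) / (R.T @ U + eps)
        W *= (X.T @ U) / (W @ (U.T @ U) + eps)
    return U, V, W

def overlapping_memberships(U, threshold=0.3):
    """Assign a node to every cluster holding at least `threshold` of its weight."""
    P = U / (U.sum(axis=1, keepdims=True) + 1e-12)
    return P >= threshold

def select_k(A, X, candidates, holdout=0.1, seed=0):
    """Pick k by mean squared error on link entries hidden during fitting."""
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    M = (rng.random(A.shape) >= holdout).astype(float)  # 1 = used for fitting
    held = 1.0 - M                                      # 1 = held out
    scores = {}
    for k in candidates:
        U, V, _ = cofactorize(A, X, k, mask=M, seed=seed)
        scores[k] = float(np.sum(held * (A - U @ V.T) ** 2) / held.sum())
    return min(scores, key=scores.get), scores
```

A typical call would be `best_k, scores = select_k(A, X, range(1, 6))`, followed by `cofactorize(A, X, best_k)` on the full data and `overlapping_memberships(U)` to read off which audiences each user belongs to. Thresholding normalized rows of `U` is only one simple way to obtain overlapping assignments.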
