Contextual Correlation Preserving Multiview Featured Graph Clustering

Graph clustering, which aims to discover sets of related vertices in graph-structured data, plays a crucial role in applications such as social community detection and biological module discovery. With the rapid growth in data volume in recent years, graph clustering is used in an increasing number of real-life scenarios. However, both classical and state-of-the-art methods consider only single-view features, or a single vector concatenating features from different views, and neglect the contextual correlation between pairwise features. They are therefore insufficient for the task, because the features that characterize vertices in a graph usually come from multiple views, and the contextual correlation between pairwise features may influence the cluster preference of vertices. To address this problem, we introduce a novel graph clustering model, dubbed contextual correlation preserving multiview featured graph clustering (CCPMVFGC), for discovering clusters in graphs with multiview vertex features. Unlike most of the aforementioned approaches, CCPMVFGC learns a shared latent space from multiview features as the cluster preference of each vertex and uses this latent space to model the inter-relationship between pairwise vertices. CCPMVFGC also uses an effective method to compute the degree of contextual correlation between pairwise vertex features, and it uses a view-wise latent space representing the feature–cluster preference to model the computed correlation. Thus, the cluster preference learned by CCPMVFGC is jointly inferred from multiview features, view-wise correlations of pairwise features, and the graph topology. Accordingly, we propose a unified objective function for CCPMVFGC and develop an iterative strategy to solve the formulated optimization problem. We also provide a theoretical analysis of the proposed model, including a convergence proof and a computational complexity analysis.
In our experiments, we extensively compare the proposed CCPMVFGC with both classical and state-of-the-art graph clustering methods on eight standard graph datasets (six multiview and two single-view datasets). The results show that CCPMVFGC achieves competitive performance on all eight datasets, which validates the effectiveness of the proposed model.
