Lime: Low-Cost and Incremental Learning for Dynamic Heterogeneous Information Networks

Understanding the interconnected relationships in large-scale information networks, such as social, scholarly, and Internet of Things networks, is vital for tasks like recommendation and fraud detection. The vast majority of real-world networks are inherently heterogeneous and dynamic: they contain many different types of nodes and edges, and they can change drastically over time. This dynamicity and heterogeneity make it extremely challenging to reason about the network structure. Existing approaches are inadequate for modeling real-life networks because they require extensive computational resources and do not scale well to large, dynamically evolving networks. We introduce LIME, a low-cost, incremental approach for modeling dynamic and heterogeneous information networks. LIME is designed to extract high-quality network representations using significantly less memory and computation time than state-of-the-art methods. Unlike prior work that uses a separate vector to encode each network node, we exploit the semantic relationships among network nodes to encode multiple nodes with similar semantics in shared vectors. We evaluate LIME by applying it to three representative network-based tasks (node classification, node clustering, and anomaly detection) on three large-scale datasets. Our extensive experiments demonstrate that LIME reduces the memory footprint by over 80% and the computation time by more than 2x when learning network representations, while delivering comparable performance on downstream tasks.
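To make the idea of shared vectors concrete, the following is a minimal sketch and not LIME's actual encoding, which the abstract does not specify. It assumes a two-component scheme in which nodes are placed on a square grid and each node's embedding is the concatenation of one shared row vector and one shared column vector. The function names (`build_shared_table`, `embed`, `parameter_counts`) are hypothetical, and placing nodes on the grid by ID is only a placeholder for a grouping based on semantic similarity.

```python
import math
import random


def grid_side(num_nodes):
    """Smallest side length such that side * side >= num_nodes."""
    return math.isqrt(num_nodes - 1) + 1


def build_shared_table(num_nodes, dim, seed=0):
    """Allocate shared row and column vectors (assumed scheme, not LIME's).

    Each component vector has dim // 2 entries, so the concatenated
    embedding of a node has dim entries when dim is even.
    """
    side = grid_side(num_nodes)
    half = dim // 2
    rng = random.Random(seed)
    rows = [[rng.uniform(-1, 1) for _ in range(half)] for _ in range(side)]
    cols = [[rng.uniform(-1, 1) for _ in range(half)] for _ in range(side)]
    return side, rows, cols


def embed(node_id, side, rows, cols):
    """Look up a node's embedding as its row vector followed by its column vector.

    Placement by node ID is a placeholder; a semantic grouping would decide
    which nodes share a row or a column.
    """
    r, c = divmod(node_id, side)
    return rows[r] + cols[c]


def parameter_counts(num_nodes, dim):
    """Return (per-node table size, shared table size) in number of floats."""
    side = grid_side(num_nodes)
    per_node = num_nodes * dim
    shared = 2 * side * (dim // 2)
    return per_node, shared


if __name__ == "__main__":
    side, rows, cols = build_shared_table(num_nodes=10_000, dim=128)
    vec = embed(42, side, rows, cols)
    print(len(vec), parameter_counts(10_000, 128))
```

In this sketch, nodes in the same grid row reuse one row vector and nodes in the same grid column reuse one column vector, so the number of stored parameters grows with the square root of the node count rather than linearly. The savings shown here are a property of the assumed scheme and should not be read as LIME's reported figures.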
