A Tensor-based Markov Chain Model for Heterogeneous Information Network Collective Classification

Heterogeneous Information Network (HIN) collective classification studies the problem of predicting labels for one type of node in a HIN that contains multiple types of nodes and multiple types of links among them. Previous studies have shown that exploiting the relative importance of links improves node classification performance, because connected nodes tend to have similar labels. Most existing approaches exploit the relative importance of links either by directly counting the number of connections among nodes or by learning the weight of each type of link from labeled data only. The former neglects how important each type of link is to the class labels, and the latter may lead to overfitting. We propose a Tensor-based Markov chain (T-Mark) approach, which automatically and simultaneously predicts the labels of unlabeled nodes and gives the relative importance of the types of links that actually improve classification accuracy. Specifically, we build two tensor equations from the HIN and the features of nodes in both labeled and unlabeled data. We propose a Markov chain-based model and solve it by an iterative process to obtain the stationary distributions. Theoretical analyses of the existence and uniqueness of such probability distributions are given. Extensive experimental results demonstrate that T-Mark achieves superior performance against the compared methods and obtains reasonable relative importance of links.
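
The abstract does not state the two tensor equations, so the following is only a minimal sketch of the general idea, not the T-Mark equations themselves. It assumes a generic co-ranking iteration with restart on a nodes × nodes × link-types tensor: one probability vector over nodes and one over link types are updated alternately until they stop changing, and one chain is run per class with the restart mass placed on that class's labeled nodes. The names `normalize`, `stationary`, `alpha`, and the toy network are all hypothetical, and node features are omitted.

```python
import numpy as np

def normalize(a, axis):
    """Scale slices along `axis` to sum to 1; all-zero slices become uniform."""
    s = a.sum(axis=axis, keepdims=True)
    uniform = np.full_like(a, 1.0 / a.shape[axis])
    return np.where(s > 0, a / np.where(s > 0, s, 1.0), uniform)

def stationary(adj, restart, alpha=0.15, tol=1e-10, max_iter=1000):
    """Iterate node scores x and link-type scores y to an approximate fixed point.

    adj[i, j, k] > 0 means node j links to node i through link type k.
    restart is a probability vector over nodes (assumed: labeled seeds).
    """
    adj = np.asarray(adj, dtype=float)
    n, _, m = adj.shape
    O = normalize(adj, axis=0)   # transition over target nodes, per (j, k)
    R = normalize(adj, axis=2)   # transition over link types, per (i, j)
    x = np.full(n, 1.0 / n)
    y = np.full(m, 1.0 / m)
    for _ in range(max_iter):
        # Node update: walk along links weighted by y, or restart at the seeds.
        x_new = (1 - alpha) * np.einsum("ijk,j,k->i", O, x, y) + alpha * restart
        # Link-type update: weight each type by the node pairs it connects.
        y_new = np.einsum("ijk,i,j->k", R, x_new, x_new)
        change = np.abs(x_new - x).sum() + np.abs(y_new - y).sum()
        x, y = x_new, y_new
        if change < tol:
            break
    return x, y

# Toy network (hypothetical): link type 0 stays inside two clusters,
# link type 1 connects every pair of nodes.
adj = np.zeros((6, 6, 2))
for cluster in ([0, 1, 2], [3, 4, 5]):
    for i in cluster:
        for j in cluster:
            if i != j:
                adj[i, j, 0] = 1.0
adj[:, :, 1] = 1.0

labeled = {0: 0, 3: 1}           # node index -> class label
scores, link_weights = [], []
for c in range(2):
    seeds = [v for v, lab in labeled.items() if lab == c]
    restart = np.zeros(6)
    restart[seeds] = 1.0 / len(seeds)
    x, y = stationary(adj, restart)
    scores.append(x)
    link_weights.append(y)

scores = np.stack(scores)        # shape: (classes, nodes)
predicted = np.argmax(scores, axis=0)
print(predicted, link_weights)
```

In this sketch each node is assigned the class whose chain gives it the highest stationary score, and the per-class vector `y` plays the role of a relative importance over link types. Whether the paper combines the two outputs in this way is an assumption.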
