Heterogeneous Network Embedding via Deep Architectures

Data embedding is used in many machine learning applications to create low-dimensional feature representations that preserve the structure of data points in their original space. In this paper, we examine the scenario of a heterogeneous network with nodes and content of various types. Such networks are notoriously difficult to mine because of the bewildering combination of heterogeneous contents and structures. Creating a multidimensional embedding of such data opens the door to a wide variety of off-the-shelf mining techniques for multidimensional data. Despite the importance of this problem, little work has addressed embedding networks of scalable, dynamic, and heterogeneous data. In such cases, both the content and the linkage structure provide important cues for creating a unified feature representation of the underlying network. We design a deep embedding algorithm for networked data. A highly nonlinear multi-layered embedding function is used to capture the complex interactions between the heterogeneous data in a network. Our goal is to create a multi-resolution deep embedding function that reflects both the local and global network structures and makes the resulting embedding useful for a variety of data mining tasks. In particular, we demonstrate that the rich content and linkage information in a heterogeneous network can be captured by such an approach, so that similarities among cross-modal data can be measured directly in a common embedding space. Once this goal has been achieved, a wide variety of data mining problems can be solved by applying off-the-shelf algorithms designed for handling vector representations. Our experiments on real-world network datasets show the effectiveness and scalability of the proposed algorithm compared with state-of-the-art embedding methods.
