Deep Learning for Learning Graph Representations

Mining graph data has become a popular research topic in computer science and has been widely studied in both academia and industry, given the increasing amount of network data in recent years. However, the sheer volume of network data poses great challenges for efficient analysis. This motivates graph representation, which maps a graph into a low-dimensional vector space while preserving the original graph structure and supporting graph inference. Efficient graph representation is of both theoretical significance and practical importance. In this chapter, we therefore introduce some basic ideas in graph representation (network embedding) as well as some representative models.
