Graph Entropy Guided Node Embedding Dimension Selection for Graph Neural Networks

Graph representation learning has achieved great success in many areas, including e-commerce, chemistry, and biology. However, the fundamental problem of choosing the appropriate node embedding dimension for a given graph remains unsolved. Common strategies for Node Embedding Dimension Selection (NEDS), based on grid search or empirical knowledge, suffer from heavy computation cost or poor model performance. In this paper, we revisit NEDS from the perspective of the minimum entropy principle and propose a novel Minimum Graph Entropy (MinGE) algorithm for NEDS on graph data. Specifically, MinGE jointly considers feature entropy and structure entropy, each designed to capture a distinct source of information in the graph. The feature entropy, built on the assumption that embeddings of adjacent nodes should be similar, couples node features with link topology. The structure entropy takes the normalized degree as its basic unit and further measures the higher-order structure of the graph. Based on these two terms, MinGE directly calculates the ideal node embedding dimension for any graph. Comprehensive experiments with popular Graph Neural Networks (GNNs) on benchmark datasets demonstrate the effectiveness and generalizability of the proposed MinGE.
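
To make the idea concrete, below is a minimal Python sketch of entropy-guided dimension selection. The structure entropy follows the abstract's description (Shannon entropy over the normalized degree distribution), but the feature-entropy formula, the candidate-dimension grid, and the function names (structure_entropy, feature_entropy, minge_dimension) are illustrative assumptions, not the paper's actual MinGE derivation, which computes the dimension directly rather than by scanning.

```python
import numpy as np

def structure_entropy(adj: np.ndarray) -> float:
    """Shannon entropy over the normalized degree distribution.

    Follows the abstract's description of taking the normalized
    degree as the basic unit of structure entropy.
    """
    deg = adj.sum(axis=1)
    p = deg / deg.sum()          # normalized degrees form a distribution
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def feature_entropy(adj: np.ndarray, feats: np.ndarray, dim: int) -> float:
    """Hypothetical dimension-dependent feature entropy (NOT the paper's formula).

    Encodes the stated assumption that adjacent nodes should have similar
    embeddings: residual dissimilarity across edges is easier to absorb in
    a larger dimension (first term), while a larger dimension costs more to
    describe (second term), giving a U-shaped objective with a minimizer.
    """
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    rows, cols = np.nonzero(adj)
    sim = float((f[rows] * f[cols]).sum(axis=1).mean())  # mean cosine similarity over edges
    mismatch = len(rows) * (1.0 - sim)
    return mismatch / dim + dim * np.log(adj.shape[0])

def minge_dimension(adj: np.ndarray, feats: np.ndarray,
                    candidates=range(8, 513, 8)) -> int:
    """Return the candidate dimension minimizing the total graph entropy.

    The paper derives a direct calculation; this sketch scans a grid of
    candidate dimensions instead to stay self-contained.
    """
    s = structure_entropy(adj)   # constant in dim, included for completeness
    return min(candidates, key=lambda d: feature_entropy(adj, feats, d) + s)
```

A toy run on a random undirected graph shows the intended usage: build a symmetric adjacency matrix and a raw feature matrix, then ask for the entropy-minimizing dimension.

```python
# Toy usage on a random symmetric graph with 16 raw feature channels.
rng = np.random.default_rng(0)
A = (rng.random((100, 100)) < 0.05).astype(float)
A = np.triu(A, 1); A = A + A.T          # undirected, no self-loops
X = rng.normal(size=(100, 16))
print(minge_dimension(A, X))            # small dimension for this sparse toy graph
```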
