A Consistent Diffusion-Based Algorithm for Semi-Supervised Classification on Graphs

Semi-supervised classification on graphs aims to assign labels to all nodes of a graph based on the labels known for a few nodes, called the seeds. The most popular algorithm relies on the principle of heat diffusion: the labels of the seeds are spread by thermo-conductance, and the temperature of each node is used as a score function for each label. Using a simple block model, we prove that this algorithm is not consistent unless the temperatures of the nodes are centered before classification. We show that this simple modification of the algorithm is enough to get significant performance gains on real data.
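The idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that the diffusion is run to equilibrium (a Dirichlet problem in which seeds of the current label are held at temperature 1 and all other seeds at 0), and that "centering" means subtracting, for each label, the mean temperature over all nodes. The abstract states neither detail, and the function names `dirichlet_temperatures` and `classify` are hypothetical.

```python
import numpy as np

def dirichlet_temperatures(adjacency, seeds, n_labels):
    """Equilibrium temperatures, one column per label (assumed setup).

    adjacency: symmetric (n, n) array of a connected graph.
    seeds: dict mapping seed node index -> label index.
    """
    n = adjacency.shape[0]
    seed_nodes = np.array(sorted(seeds))
    free_nodes = np.array([i for i in range(n) if i not in seeds])
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency

    temps = np.zeros((n, n_labels))
    for node, label in seeds.items():
        temps[node, label] = 1.0  # hot seed for its own label, cold otherwise

    # At equilibrium each free node is the average of its neighbors,
    # i.e. L_ff T_f = -L_fs T_s (harmonic extension of the seed values).
    l_ff = laplacian[np.ix_(free_nodes, free_nodes)]
    l_fs = laplacian[np.ix_(free_nodes, seed_nodes)]
    temps[free_nodes] = np.linalg.solve(l_ff, -l_fs @ temps[seed_nodes])
    return temps

def classify(temps, centered=True):
    """Pick the label with the highest (optionally centered) temperature."""
    # Assumption: centering subtracts each label's mean over all nodes.
    scores = temps - temps.mean(axis=0) if centered else temps
    return scores.argmax(axis=1)

# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = np.zeros((6, 6))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

temps = dirichlet_temperatures(adjacency, seeds={0: 0, 5: 1}, n_labels=2)
print(classify(temps, centered=False))
print(classify(temps, centered=True))
```

On this symmetric toy graph both variants agree, since each label's mean temperature is the same. Centering only changes the decision when the mean temperatures differ across labels, for instance when the seeds are unevenly distributed among the classes.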
