Active Learning on Graphs via Spanning Trees

Active learning algorithms for graph node classification select a subset L of nodes in a given graph. The goal is to minimize the mistakes made on the remaining nodes by a standard node classifier that uses L as its training set. Bilmes and Guillory introduced a combinatorial quantity, Ψ∗(L), and related it to the performance of the mincut classifier run on any given training set L. While no efficient algorithms for minimizing Ψ∗ are known, they show that simple heuristics for (approximately) minimizing it do not work well in practice. Building on previous theoretical results about active learning on trees, we show that exact minimization of Ψ∗ on suitable spanning trees of the graph yields an efficient active learner that compares well against standard baselines on real-world graphs.
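To make the setting concrete, the following is a minimal sketch of the evaluation protocol described above: extract a spanning tree, pick a query set L, predict the remaining labels from L, and count mistakes. It is an illustration under stated assumptions, not the method of the paper. The BFS tree, the nearest-labeled-node predictor, the toy graph, and the hand-picked query sets are all hypothetical stand-ins; in particular, the sketch does not compute or minimize Ψ∗, and it does not use the mincut classifier.

```python
from collections import deque


def bfs_spanning_tree(adj, root):
    """Return adjacency lists of a BFS spanning tree of a connected graph.

    Assumption: any spanning tree serves for this illustration; the abstract
    only says "suitable" trees and does not specify how they are chosen.
    """
    tree = {v: [] for v in adj}
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                tree[u].append(v)
                tree[v].append(u)
                queue.append(v)
    return tree


def nearest_label_predict(tree, labeled):
    """Give each node the label of its closest queried node in the tree.

    Multi-source BFS; ties go to whichever source reaches the node first.
    This is a simple stand-in predictor, not the mincut classifier.
    """
    pred = dict(labeled)
    queue = deque(labeled)
    while queue:
        u = queue.popleft()
        for v in tree[u]:
            if v not in pred:
                pred[v] = pred[u]
                queue.append(v)
    return pred


def count_mistakes(pred, truth, query_set):
    """Count prediction errors on the nodes outside the query set."""
    return sum(1 for v in truth if v not in query_set and pred[v] != truth[v])


def run(adj, truth, query_set, root=0):
    """Query the labels of query_set, predict the rest, return the mistakes."""
    tree = bfs_spanning_tree(adj, root)
    labeled = {v: truth[v] for v in query_set}
    pred = nearest_label_predict(tree, labeled)
    return count_mistakes(pred, truth, query_set)


# Toy graph: two triangles joined by the edge 2-3, one class per triangle.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
truth = {0: +1, 1: +1, 2: +1, 3: -1, 4: -1, 5: -1}

mistakes_cut = run(adj, truth, query_set=[2, 3])   # queries straddle the cut
mistakes_side = run(adj, truth, query_set=[0, 1])  # queries on one side only
print(mistakes_cut, mistakes_side)
```

On this toy graph, querying the two endpoints of the cut edge lets the predictor label every remaining node correctly, whereas spending the same budget inside one triangle mislabels the whole other triangle. The gap between the two query sets is the kind of effect a selection criterion such as Ψ∗ is meant to capture.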

[1]  P. Blau.  Inequality and Heterogeneity: A Primitive Theory of Social Structure , 1978 .

[2]  David Bruce Wilson,et al.  Generating random spanning trees more quickly than the cover time , 1996, STOC '96.

[3]  Avrim Blum,et al.  Learning from Labeled and Unlabeled Data using Graph Mincuts , 2001, ICML.

[4]  M. McPherson,et al.  Birds of a Feather: Homophily in Social Networks , 2001 .

[5]  Zoubin Ghahramani,et al.  Learning from labeled and unlabeled data with label propagation , 2002 .

[6]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML.

[7]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[8]  Mikhail Belkin,et al.  Regularization and Semi-supervised Learning on Large Graphs , 2004, COLT.

[9]  Gerhard Weikum,et al.  Graph-based text classification: learn from your neighbors , 2006, SIGIR.

[10]  Ehsan Chiniforooshan,et al.  On the Complexity of Finding an Unknown Cut Via Vertex Queries , 2007, COCOON.

[11]  Mark Herbster,et al.  Fast Prediction on a Tree , 2008, NIPS.

[12]  Jianping Yin,et al.  Graph-Based Active Learning Based on Label Propagation , 2008, MDAI.

[13]  En Zhu,et al.  A Scalable Algorithm for Graph-Based Active Learning , 2008, FAW.

[14]  Guy Lever,et al.  Online Prediction on Large Diameter Graphs , 2008, NIPS.

[15]  Jeff A. Bilmes,et al.  Label Selection on Graphs , 2009, NIPS.

[16]  Claudio Gentile,et al.  Random Spanning Trees and the Prediction of Weighted Graphs , 2010, ICML.

[17]  Claudio Gentile,et al.  Active Learning on Trees and Graphs , 2010, COLT.

[18]  Lise Getoor,et al.  Active Learning for Networked Data , 2010, ICML.

[19]  Nicolas Le Roux,et al.  Label Propagation and Quadratic Criterion , 2022 .