Paper Information - Active Learning on Trees and Graphs

Active Learning on Trees and Graphs

We investigate the problem of active learning on a given tree whose nodes are assigned binary labels in an adversarial way. Inspired by recent results by Guillory and Bilmes, we characterize (up to constant factors) the optimal placement of queries so as to minimize the mistakes made on the non-queried nodes. Our query selection algorithm is extremely efficient, and the optimal number of mistakes on the non-queried nodes is achieved by a simple and efficient mincut classifier. Through a simple modification of the query selection algorithm we also show optimality (up to constant factors) with respect to the trade-off between the number of queries and the number of mistakes on non-queried nodes. By using spanning trees, our algorithms can be efficiently applied to general graphs, although the problem of finding optimal and efficient active learning algorithms for general graphs remains open. To this end, we provide a lower bound on the number of mistakes made on arbitrary graphs by any active learning algorithm using a number of queries which is up to a constant fraction of the graph size.
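The abstract says that labels on non-queried nodes are predicted by a mincut classifier. The following is a minimal sketch of one such classifier, assuming an unweighted tree with at least one edge and labels in {0, 1}: it keeps the queried labels fixed and labels the remaining nodes so that as few edges as possible join nodes with different labels. The function name `mincut_labels`, the input format, and the tie-breaking are illustrative assumptions, not taken from the paper, and the query selection step is not covered.

```python
from collections import defaultdict

def mincut_labels(edges, queried):
    """Hypothetical helper: label all nodes of a tree with 0/1.

    edges   -- list of (u, v) pairs forming a tree (at least one edge)
    queried -- dict mapping each queried node to its known label (0 or 1)
    Returns (labels, cut), where cut is the number of edges whose
    endpoints receive different labels.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Root the tree at an arbitrary node and record a preorder traversal.
    root = next(iter(adj))
    parent = {root: None}
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)

    INF = float("inf")
    cost = {}  # cost[u][y]: fewest cut edges in u's subtree if u gets label y
    for u in reversed(order):  # children are processed before their parent
        c = [0, 0]
        for y in (0, 1):
            if u in queried and queried[u] != y:
                c[y] = INF  # a queried label may not be changed
                continue
            for v in adj[u]:
                if parent[v] == u:
                    c[y] += min(cost[v][y], cost[v][1 - y] + 1)
        cost[u] = c

    # Top-down pass: recover one optimal labeling (ties keep the parent's label).
    labels = {}
    for u in order:
        p = parent[u]
        if p is None:
            labels[u] = 0 if cost[u][0] <= cost[u][1] else 1
        else:
            y = labels[p]
            labels[u] = y if cost[u][y] <= cost[u][1 - y] + 1 else 1 - y

    cut = sum(labels[u] != labels[v] for u, v in edges)
    return labels, cut
```

For example, on the path 0-1-2-3-4 with node 0 queried as 0 and node 4 queried as 1, any labeling consistent with the queries must cut at least one edge, and the sketch returns a labeling that cuts exactly one.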

Claudio Gentile | Nicolò Cesa-Bianchi | Fabio Vitale | Giovanni Zappella

[1] Nicolas Le Roux, et al. Label Propagation and Quadratic Criterion, 2006, Semi-Supervised Learning.

[2] Avrim Blum, et al. Learning from Labeled and Unlabeled Data using Graph Mincuts, 2001, ICML.

[3] Claudio Gentile, et al. Fast and Optimal Prediction on a Labeled Tree, 2009, COLT.

[4] Ehsan Chiniforooshan, et al. On the Complexity of Finding an Unknown Cut Via Vertex Queries, 2007, COCOON.

[5] Alexander Zien, et al. Label Propagation and Quadratic Criterion, 2006.

[6] Claudio Gentile, et al. A Linear Time Active Learning Algorithm for Link Classification, 2012, NIPS.

[7] Mikhail Belkin, et al. Regularization and Semi-supervised Learning on Large Graphs, 2004, COLT.

[8] Jeff A. Bilmes, et al. Label Selection on Graphs, 2009, NIPS.

[9] Zoubin Ghahramani, et al. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions, 2003, ICML.

[10] John D. Lafferty, et al. Semi-supervised learning using randomized mincuts, 2004, ICML.

[11] Claudio Gentile, et al. A Correlation Clustering Approach to Link Classification in Signed Networks, 2012, COLT.

[12] Claudio Gentile, et al. See the Tree Through the Lines: The Shazoo Algorithm, 2011, NIPS.

[13] Fabio Vitale, et al. Navigation Piles with Applications to Sorting, Priority Queues, and Priority Deques, 2003, Nord. J. Comput.

[14] Claudio Gentile, et al. Random Spanning Trees and the Prediction of Weighted Graphs, 2010, ICML.