Paper information - Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions

Active and semi-supervised learning are important techniques when labeled data are scarce. We combine the two under a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The semi-supervised learning problem is then formulated in terms of a Gaussian random field on this graph, the mean of which is characterized in terms of harmonic functions. Active learning is performed on top of the semi-supervised learning scheme by greedily selecting queries from the unlabeled data to minimize the estimated expected classification error (risk); in the case of Gaussian fields, the risk is efficiently computed using matrix methods. We present experimental results on synthetic data, handwritten digit recognition, and text classification tasks. The active learning scheme requires far fewer queries than random query selection to achieve high accuracy.
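The two steps described in the abstract can be illustrated with a minimal sketch. It assumes binary labels in {0, 1}, a dense symmetric weight matrix `W`, an estimated risk of the form sum_i min(f_i, 1 - f_i), and a rank-one update of the harmonic solution when one more label is added. None of these details are stated in the abstract, so they are assumptions of this sketch, and the function names (`harmonic_solution`, `estimated_risk`, `select_query`) are hypothetical.

```python
import numpy as np

def harmonic_solution(W, labeled_idx, labels):
    """Harmonic function on the unlabeled nodes of a weighted graph.

    Returns the unlabeled indices, their values f_u, and G = inv(L_uu).
    """
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    L = np.diag(W.sum(axis=1)) - W            # combinatorial graph Laplacian
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    G = np.linalg.inv(L_uu)                   # fine for small graphs
    f_u = G @ W_ul @ np.asarray(labels, dtype=float)
    return unlabeled_idx, f_u, G

def estimated_risk(f_u):
    """Assumed risk estimate: sum over nodes of min(f, 1 - f)."""
    return np.minimum(f_u, 1.0 - f_u).sum()

def select_query(f_u, G):
    """Greedy choice: the unlabeled node with the lowest expected risk."""
    risks = np.empty(len(f_u))
    for k in range(len(f_u)):
        col = G[:, k] / G[k, k]
        # Harmonic solutions if node k were labeled 0 or 1 (rank-one update).
        f_if_0 = f_u + (0.0 - f_u[k]) * col
        f_if_1 = f_u + (1.0 - f_u[k]) * col
        # Weight each outcome by the current belief about node k's label.
        risks[k] = ((1.0 - f_u[k]) * estimated_risk(f_if_0)
                    + f_u[k] * estimated_risk(f_if_1))
    return int(np.argmin(risks)), risks

# Toy example: a 5-node chain with the two end points labeled 0 and 1.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0

unlabeled_idx, f_u, G = harmonic_solution(W, [0, 4], [0, 1])
k, risks = select_query(f_u, G)
print("harmonic values:", f_u)
print("expected risk per candidate query:", risks)
print("node to query:", unlabeled_idx[k])
```

On the chain, the harmonic values interpolate linearly between the two labeled end points. The explicit matrix inverse keeps the sketch short; a larger graph would more plausibly use a sparse solver.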

J. Lafferty | Xiaojin Zhu | Zoubin Ghahramani

[1] Peter G. Doyle, et al. Random Walks and Electric Networks, 1987.

[2] Lawrence D. Jackel, et al. Handwritten Digit Recognition with a Back-Propagation Network, 1989, NIPS.

[3] David A. Cohn, et al. Active Learning with Statistical Models, 1996, NIPS.

[4] Jonathan J. Hull, et al. A Database for Handwritten Text Recognition Research, 1994, IEEE Trans. Pattern Anal. Mach. Intell.

[5] K. Chaloner, et al. Bayesian Experimental Design: A Review, 1995.

[6] Daphne Koller, et al. Support Vector Machine Active Learning with Applications to Text Classification, 2000, ICML.

[7] Eiji Watanabe, et al. A Distributed-Cooperative Learning Algorithm for Multi-Layered Neural Networks using a PC Cluster, 2001.

[8] Daphne Koller, et al. Support Vector Machine Active Learning with Applications to Text Classification, 2000, J. Mach. Learn. Res.

[9] Craig A. Knoblock, et al. Active + Semi-supervised Learning = Robust Multi-View Learning, 2002, ICML.

[10] Zoubin Ghahramani, et al. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions, 2003, ICML.

[11] H. Sebastian Seung, et al. Selective Sampling Using the Query by Committee Algorithm, 1997, Machine Learning.