Learning from labeled and unlabeled data on a directed graph

We propose a general framework for learning from labeled and unlabeled data on a directed graph that takes the structure of the graph into account, including the directionality of its edges. The algorithm derived from this framework runs in nearly linear time, owing to recently developed numerical techniques. When no labeled instances are available, the framework can be used as a spectral clustering method for directed graphs, generalizing the spectral clustering approach for undirected graphs. We have applied the framework to real-world web classification problems and obtained encouraging results.
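The abstract does not spell out the algorithm, so the following is only a minimal sketch of how such a framework might be instantiated, not the authors' implementation. It assumes a teleporting random walk on the directed graph, a symmetrized operator built from the walk's stationary distribution, and a regularized linear system solved for the classification function. The function name `classify_on_digraph` and the parameters `alpha`, `teleport`, and `iters` are hypothetical, and a dense solver stands in for the nearly-linear-time solver the abstract alludes to.

```python
import numpy as np

def classify_on_digraph(W, y, alpha=0.1, teleport=0.01, iters=1000):
    """Hypothetical sketch. W[i, j] > 0 is the weight of edge i -> j;
    y holds +1 / -1 for labeled nodes and 0 for unlabeled ones."""
    W = np.asarray(W, dtype=float)
    y = np.asarray(y, dtype=float)
    n = W.shape[0]
    out_deg = W.sum(axis=1)
    # Row-normalize into transition probabilities; nodes without
    # out-edges jump uniformly at random (an assumption).
    P = np.where(out_deg[:, None] > 0,
                 W / np.maximum(out_deg, 1e-12)[:, None],
                 1.0 / n)
    # Teleporting makes the walk irreducible, so the stationary
    # distribution is unique and strictly positive.
    P = (1.0 - teleport) * P + teleport / n
    # Power iteration for the stationary distribution pi (pi = pi P).
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        pi = pi @ P
    pi /= pi.sum()
    s = np.sqrt(pi)
    # A = Pi^{1/2} P Pi^{-1/2}; theta is its symmetric part, which
    # keeps edge direction in play through pi and P.
    A = (s[:, None] * P) / s[None, :]
    theta = 0.5 * (A + A.T)
    # Regularized solution f = (I - alpha * theta)^{-1} y, 0 < alpha < 1.
    return np.linalg.solve(np.eye(n) - alpha * theta, y)

# Two directed 3-cycles joined by one weak edge; one label per cycle.
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]:
    W[i, j] = 1.0
W[2, 3] = 0.1
y = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0])
f = classify_on_digraph(W, y)
print(np.sign(f))
```

The sign of each entry of `f` gives the predicted class of the corresponding node. On large sparse graphs the dense solve would be replaced by an iterative solver such as conjugate gradients, since the system matrix is symmetric positive definite for `alpha` in (0, 1).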
