Knowledge Transferring Via Implicit Link Analysis

In this paper, we design a local classification algorithm using implicit link analysis, considering the situation that the labeled and unlabeled data are drawn from two different albeit related domains. In contrast to many global classifiers, e.g. Support Vector Machines, our local classifier only takes into account the neighborhood information around unlabeled data points, and is hardly based on the global distribution in the data set. Thus, the local classifier has good abilities to tackle the non-i.i.d. classification problem since its generalization will not degrade by the bias w.r.t. each unlabeled data point. We build a local neighborhood by connecting the similar data points. Based on these implicit links, the Relaxation Labeling technique is employed. In this work, we theoretically and empirically analyze our algorithm, and show how our algorithm improves the traditional classifiers. It turned out that our algorithm greatly outperforms the state-of-the-art supervised and semi-supervised algorithms when classifying documents across different domains.

[1]  Rich Caruana,et al.  Multitask Learning , 1997, Machine-mediated learning.

[2]  Michael P. Wellman A Market-Oriented Programming Environment and its Application to Distributed Multicommodity Flow Problems , 1993, J. Artif. Intell. Res..

[3]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[4]  Piotr Indyk,et al.  Enhanced hypertext categorization using hyperlinks , 1998, SIGMOD '98.

[5]  Lionel Pelkowitz,et al.  A continuous relaxation labeling algorithm for Markov random fields , 1990, IEEE Trans. Syst. Man Cybern..

[6]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[7]  R. A. Leibler,et al.  On Information and Sufficiency , 1951 .

[8]  Thomas G. Dietterich,et al.  Improving SVM accuracy by training on auxiliary data sources , 2004, ICML.

[9]  Sebastian Thrun,et al.  Learning One More Thing , 1994, IJCAI.

[10]  Juergen Schmidhuber,et al.  On learning how to learn learning strategies , 1994 .

[11]  Xiaojin Zhu,et al.  --1 CONTENTS , 2006 .

[12]  Daniel Marcu,et al.  Domain Adaptation for Statistical Classifiers , 2006, J. Artif. Intell. Res..

[13]  Gerhard Weikum,et al.  Graph-based text classification: learn from your neighbors , 2006, SIGIR.

[14]  Yiming Yang,et al.  A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.

[15]  David D. Lewis,et al.  Representation and Learning in Information Retrieval , 1991 .

[16]  T. Ben-David,et al.  Exploiting Task Relatedness for Multiple , 2003 .