Boosting for transfer learning from multiple data sources

Transfer learning aims to adapt a classifier trained on a domain with adequate labeled samples to a new domain whose samples are drawn from a different distribution and carry no class labels. In this paper, we study transfer learning with multiple data sources and present a novel boosting algorithm, SharedBoost. The algorithm scales to very high-dimensional data, such as text mining tasks where the feature dimension exceeds several tens of thousands. Experimental results show that SharedBoost significantly outperforms traditional methods that transfer knowledge with supervised learning techniques. SharedBoost also delivers better classification accuracy and more stable performance than other typical transfer learning methods, such as structural correspondence learning (SCL) and structural learning, on multiple-source transfer learning problems.
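
The abstract does not reproduce SharedBoost's update rules, so the sketch below only illustrates the general idea of boosting-based transfer from multiple labeled source domains, in the spirit of TrAdaBoost [16] and multi-source TrAdaBoost [27]: source and target instances are pooled, the weak learner's error is measured on the target portion, and source instances that disagree with the target are gradually down-weighted. Everything here is an assumption for illustration (the function name multi_source_boost, the decision-stump weak learner, the number of rounds), not the paper's algorithm; it also assumes a small labeled target sample, unlike the unlabeled-target setting described above.

```python
# Illustrative sketch only (assumed names and constants): a simplified
# multi-source, TrAdaBoost-style booster in the spirit of [16] and [27].
# It is NOT the paper's SharedBoost algorithm.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def multi_source_boost(X_sources, y_sources, X_target, y_target, n_rounds=20):
    """X_sources/y_sources: lists of arrays, one per source domain.
    X_target/y_target: small labeled target set, labels in {0, 1}."""
    X_src = np.vstack(X_sources)
    y_src = np.concatenate(y_sources)
    X = np.vstack([X_src, X_target])
    y = np.concatenate([y_src, y_target])
    n_src = len(y_src)

    w = np.ones(len(y)) / len(y)                      # instance weights
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_rounds))
    learners, alphas = [], []

    for _ in range(n_rounds):
        w /= w.sum()
        stump = DecisionTreeClassifier(max_depth=1)   # weak learner
        stump.fit(X, y, sample_weight=w)
        miss = (stump.predict(X) != y).astype(float)

        # Weighted error on the target portion only.
        eps = np.clip(w[n_src:] @ miss[n_src:] / w[n_src:].sum(), 1e-10, 0.499)
        beta_tgt = eps / (1.0 - eps)

        # Misclassified source instances are trusted less (down-weighted);
        # misclassified target instances are emphasized, AdaBoost-style.
        w[:n_src] *= beta_src ** miss[:n_src]
        w[n_src:] *= beta_tgt ** -miss[n_src:]

        learners.append(stump)
        alphas.append(np.log(1.0 / beta_tgt))

    return learners, alphas

def predict(learners, alphas, X):
    votes = sum(a * (2 * h.predict(X) - 1) for h, a in zip(learners, alphas))
    return (votes > 0).astype(int)
```

Measuring the weak learner's error only on the target portion, while letting misclassified source instances decay, captures the intuition behind boosting-based transfer: auxiliary source data contribute only insofar as they remain consistent with the target task.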

[1] Koby Crammer et al., Analysis of Representations for Domain Adaptation, NIPS, 2006.

[2] Gang Wang et al., A novel learning approach to multiple tasks based on boosting methodology, Pattern Recognition Letters, 2010.

[3] Peter Stone et al., Boosting for Regression Transfer, ICML, 2010.

[4] Ping Li et al., ABC-boost: adaptive base class boost for multi-class classification, ICML, 2009.

[5] Yoav Freund et al., Experiments with a New Boosting Algorithm, ICML, 1996.

[6] ChengXiang Zhai et al., Instance Weighting for Domain Adaptation in NLP, ACL, 2007.

[7] Tong Zhang et al., A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data, Journal of Machine Learning Research, 2005.

[8] Kristin P. Bennett et al., Constructing Orthogonal Latent Features for Arbitrary Loss, Feature Extraction, 2006.

[9] Christopher M. Bishop, Neural Networks for Pattern Recognition, 1995.

[10] Jianping Fan et al., Integrating Concept Ontology and Multitask Learning to Achieve More Effective Classifier Training for Multilevel Image Annotation, IEEE Transactions on Image Processing, 2008.

[11] Jianping Fan et al., Building concept ontology for medical video annotation, ACM Multimedia, 2006.

[12] J. Friedman, Additive logistic regression: A statistical view of boosting, Annals of Statistics, 2000.

[13] Jenn-Jier James Lien et al., Pedestrian Detection System Using Cascaded Boosting with Invariance of Oriented Gradients, International Journal of Pattern Recognition and Artificial Intelligence, 2009.

[14] B. Kégl et al., Fast boosting using adversarial bandits, ICML, 2010.

[15] Yishay Mansour et al., Domain Adaptation with Multiple Sources, NIPS, 2008.

[16] Qiang Yang et al., Boosting for transfer learning, ICML, 2007.

[17] Rajat Raina et al., Self-taught learning: transfer learning from unlabeled data, ICML, 2007.

[18] Yoram Singer et al., Improved Boosting Algorithms Using Confidence-rated Predictions, COLT, 1998.

[19] Jian Li et al., Reducing the Overfitting of AdaBoost by Controlling its Data Distribution Skewness, International Journal of Pattern Recognition and Artificial Intelligence, 2006.

[20] ChengXiang Zhai et al., A two-stage approach to domain adaptation for statistical classifiers, CIKM, 2007.

[21] Antonio Torralba et al., Sharing Visual Features for Multiclass and Multiview Object Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.

[22] Balázs Kégl et al., Boosting products of base classifiers, ICML, 2009.

[23] John Blitzer et al., Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification, ACL, 2007.

[24] Yoram Singer et al., BoosTexter: A Boosting-based System for Text Categorization, Machine Learning, 2000.

[25] Massimiliano Pontil et al., Best of NIPS 2005: Highlights on the 'Inductive Transfer: 10 Years Later' Workshop, 2006.

[26] Hal Daumé et al., Frustratingly Easy Domain Adaptation, ACL, 2007.

[27] Yi Yao et al., Boosting for transfer learning with multiple sources, CVPR, 2010.

[28] Antonio Torralba et al., Sharing features: efficient boosting procedures for multiclass object detection, CVPR, 2004.

[29] John Blitzer et al., Domain Adaptation with Structural Correspondence Learning, EMNLP, 2006.

[30] Harris Drucker et al., Boosting Performance in Neural Networks, International Journal of Pattern Recognition and Artificial Intelligence, 1993.