Robust non-negative matrix factorization via joint sparse and graph regularization for transfer learning

Real-world applications often involve data that are high-dimensional, sparse, noisy, and not independent and identically distributed (non-i.i.d.). In this paper, we handle such complex data in a transfer learning framework and propose a robust non-negative matrix factorization model with joint sparse and graph regularization. First, we apply robust non-negative matrix factorization with sparse regularization (RSNMF) to the source domain data to learn a meaningful matrix that captures information common to the source and target domains. Second, we treat this learned matrix as a bridge and transfer it to the target domain, where the target data are reconstructed by our robust non-negative matrix factorization with joint sparse and graph regularization (RSGNMF). Third, we apply feature selection to the new sparse representation of the target data. Fourth, we provide novel, efficient iterative algorithms for the RSNMF and RSGNMF models, each with a rigorous analysis of convergence and correctness. Finally, experimental results on both text and image data sets demonstrate that our REGTL model outperforms existing state-of-the-art methods.
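
The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the paper's RSNMF/RSGNMF algorithm: the abstract does not give the objective functions, so this sketch swaps in a plain squared Frobenius loss with an L1 penalty and a graph Laplacian penalty, optimized by standard multiplicative updates. It also assumes that the transferred "bridge" matrix is the basis matrix `U`. All names (`knn_graph`, `sparse_graph_nmf`, `lam`, `beta`, `U_fixed`) are hypothetical.

```python
import numpy as np

def knn_graph(X, n_neighbors=5):
    """Symmetric 0/1 k-nearest-neighbour affinity over the columns (samples) of X."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbour
    idx = np.argsort(dist, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    return np.maximum(W, W.T)

def objective(X, U, V, W, lam, beta):
    """||X - U V^T||_F^2 + lam * Tr(V^T L V) + 2 * beta * sum(V), with L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return (np.linalg.norm(X - U @ V.T) ** 2
            + lam * np.trace(V.T @ L @ V)
            + 2.0 * beta * V.sum())

def sparse_graph_nmf(X, k, W=None, U_fixed=None, lam=0.0, beta=0.0,
                     n_iter=200, seed=0, eps=1e-10):
    """Factor X (features x samples) as U @ V.T with U, V >= 0.

    Assumption: if U_fixed is given, it plays the role of the transferred
    matrix and is held constant while only V is updated.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) if U_fixed is None else U_fixed.copy()
    V = rng.random((n, k))
    if W is None:
        W = np.zeros((n, n))  # no graph term
    D = np.diag(W.sum(axis=1))
    for _ in range(n_iter):
        if U_fixed is None:
            U *= (X @ V) / (U @ (V.T @ V) + eps)
        # Graph and sparsity terms enter the multiplicative update for V.
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + beta + eps)
    return U, V

# Stage 1: sparse NMF on (synthetic) source data to learn a basis.
rng = np.random.default_rng(0)
X_src = rng.random((20, 30))
X_tgt = rng.random((20, 25))
U_src, _ = sparse_graph_nmf(X_src, k=4, beta=0.1)

# Stage 2: re-represent target data with the basis fixed, adding a graph term.
W_tgt = knn_graph(X_tgt, n_neighbors=3)
_, V_tgt = sparse_graph_nmf(X_tgt, k=4, W=W_tgt, U_fixed=U_src, lam=0.5, beta=0.1)
```

Here `V_tgt` is the new non-negative representation of the target samples, on which a feature selection step could then operate. The sketch does not normalize the columns of `U`, so in the source stage a penalty on `V` alone can be partly evaded by rescaling the factors; a fuller implementation would need to address that.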
