Transfer Learning of Distance Metrics by Cross-Domain Metric Sampling across Heterogeneous Spaces

Transfer learning has recently attracted great interest across a variety of machine learning applications. In this paper, we approach it from a new angle: distance function learning. Specifically, we study how knowledge of distance functions in one domain can be transferred to a new domain. A good semantic understanding of the feature space is critical for setting up good, domain-specific distance functions. Unfortunately, not all domains have equally interpretable feature representations. In some domains, such as text, the semantics of the feature representation are clear, so a domain expert can easily set up distance functions for specific kinds of semantics. In image data, the features are semantically harder to interpret, and it is correspondingly harder to set up distance functions, especially for particular semantic criteria. We therefore use transfer learning to close the semantic gap between domains, and show how correspondence information between two domains can be used to set up distance functions for the semantically more challenging domain.
