An Information-Theoretic Approach to Transferability in Task Transfer Learning

Task transfer learning is a popular technique in image processing applications that uses pre-trained models to reduce the supervision cost of related tasks. An important question is how to determine task transferability, i.e., given a common input domain, to what extent representations learned from a source task can help in learning a target task. Typically, transferability is either measured experimentally or inferred through task relatedness, which is often defined without a clear operational meaning. In this paper, we present H-score, a novel, easily computable evaluation function that estimates the performance of representations transferred from one task to another in classification problems, using statistical and information-theoretic principles. Experiments on real image data show that our metric is not only consistent with empirical transferability measurements, but also useful to practitioners in applications such as source model selection and task transfer curriculum learning.
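
The abstract does not spell out the H-score formula, so the following is only a minimal sketch of how such a transferability score could be computed in practice, assuming the formulation H(f) = tr( cov(f(X))^{-1} cov( E[f(X) | Y] ) ), where f(X) are features produced by a source model and Y are the target-task labels. The function name h_score, the variable names, and the use of a pseudo-inverse for numerical stability are illustrative choices, not taken from the paper.

    # Minimal sketch (see lead-in above): an H-score-style transferability
    # estimate from source-model features and target-task labels, assuming
    #   H(f) = tr( cov(f(X))^{-1} cov( E[f(X) | Y] ) ).
    import numpy as np

    def h_score(features: np.ndarray, labels: np.ndarray) -> float:
        """features: (n_samples, d) array of source-model outputs f(X);
        labels: (n_samples,) array of target-task class labels Y."""
        features = features - features.mean(axis=0, keepdims=True)  # center f(X)
        cov_f = np.cov(features, rowvar=False)                      # cov(f(X)), shape (d, d)

        # Build E[f(X) | Y = y] per sample so its sample covariance is cov(E[f(X)|Y]).
        cond_mean = np.zeros_like(features)
        for y in np.unique(labels):
            idx = labels == y
            cond_mean[idx] = features[idx].mean(axis=0)
        cov_g = np.cov(cond_mean, rowvar=False)                     # cov(E[f(X)|Y])

        # Pseudo-inverse guards against a rank-deficient feature covariance.
        return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))

    # Example usage with random data (illustrative only):
    # feats = np.random.randn(1000, 64); ys = np.random.randint(0, 10, size=1000)
    # print(h_score(feats, ys))

Intuitively, under this assumed formulation a larger value indicates that the conditional class means are well separated relative to the overall spread of the features, which is the sense in which a higher score would suggest better transferability to the target classification task.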
