Zero-data Learning of New Tasks

We introduce the problem of zero-data learning, where a model must generalize to classes or tasks for which no training data are available and only a description of the classes or tasks is provided. Zero-data learning is useful for problems where the set of classes to distinguish or tasks to solve is very large and is not entirely covered by the training data. The main contributions of this work are a general formalization of zero-data learning, an experimental analysis of its properties, and empirical evidence that generalization is possible and significant in this setting. The experiments address two character recognition classification problems and a multitask ranking problem from drug discovery. Finally, we conclude by discussing how this new framework offers a novel perspective on extending machine learning towards AI, where an agent can be given a specification of a learning problem before attempting to solve it (with very few or even zero examples).
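To make the setup concrete, here is a minimal sketch of the zero-data idea, not the paper's actual models: a bilinear compatibility score between an input and a class-description vector, trained with cross-entropy on classes that have data, then applied unchanged to novel classes known only through their descriptions. All function names, dimensions, and data below are hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_compat(X, Y, D, epochs=500, lr=0.5):
    """Learn W so that score(x, d) = x @ W @ d is high when d describes
    the class of x. X: (n, p) inputs; Y: (n,) class indices into D;
    D: (k, q) class-description vectors for the k training classes."""
    n, p = X.shape
    k, q = D.shape
    W = np.zeros((p, q))
    for _ in range(epochs):
        S = X @ W @ D.T                          # (n, k) compatibility scores
        P = np.exp(S - S.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)        # softmax over training classes
        G = P.copy()
        G[np.arange(n), Y] -= 1.0                # gradient of cross-entropy wrt S
        W -= lr * (X.T @ G @ D) / n              # gradient step on W
    return W

def predict(W, X, D_new):
    """Classify against *novel* class descriptions: no training examples
    of these classes were ever seen, only their description vectors."""
    return np.argmax(X @ W @ D_new.T, axis=1)

# Toy demo (hypothetical data): train on two described classes, then
# recognize two novel classes from their descriptions alone.
D_train = np.array([[1., 0.], [0., 1.]])
Y = rng.integers(0, 2, size=200)
X = 3 * D_train[Y] + rng.normal(scale=0.3, size=(200, 2))
W = fit_compat(X, Y, D_train)

D_test = np.array([[1., 1.], [-1., -1.]])        # descriptions of unseen classes
X_new = 3 * D_test[0] + rng.normal(scale=0.3, size=(5, 2))
print(predict(W, X_new, D_test))                 # expected: [0 0 0 0 0]
```

Because the class description enters the model as an input rather than indexing a fixed output layer, any class that comes with a description can be scored at test time, which is what lets generalization extend to classes with zero training examples.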
