Generalizing to a zero-data task: a computational chemistry case study

We investigate the problem of learning several tasks simultaneously in order to transfer the acquired knowledge to a completely new task for which no training data are available. Assuming that the tasks share a representation that can be discovered efficiently, such a scenario should yield a better model of the new task than one built from knowledge of the new task alone. We evaluate several supervised learning algorithms for discovering representations shared among the tasks of a computational chemistry/drug discovery problem. We cast the problem in a statistical learning framework and formulate the general hypotheses that must be tested in order to validate the multi-task learning approach. We then evaluate the performance of the learning algorithms and show that it is indeed possible to learn a shared representation of the tasks that generalizes to a new task for which no training data are available. From a theoretical point of view, our contribution includes a modification of the Support Vector Machine algorithm that incorporates multi-task learning concepts at its core and achieves state-of-the-art results. From a practical point of view, this algorithm can be readily used by pharmaceutical companies for virtual screening campaigns.
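The abstract does not spell out how multi-task concepts enter the SVM. A common way to achieve this with kernel methods is a multi-task kernel that couples examples across tasks through a shared component. The sketch below is a minimal, hypothetical illustration (not the paper's actual algorithm): the coupling strength `mu`, the synthetic "tasks", and the linear base kernel are all assumptions made for the example.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a drug-discovery setting: three related binary
# "tasks" (e.g. hypothetical assay endpoints) over the same descriptor space,
# all driven by one shared direction w, so a shared representation exists.
n_per_task, n_feat, n_tasks = 30, 5, 3
X = rng.normal(size=(n_per_task * n_tasks, n_feat))
tasks = np.repeat(np.arange(n_tasks), n_per_task)
w = rng.normal(size=n_feat)
y = np.sign(X @ w + 0.1 * rng.normal(size=len(X))).astype(int)

def multitask_kernel(X1, t1, X2, t2, mu=0.5):
    """Gram matrix for k((x, s), (x', t)) = (mu + [s == t]) * <x, x'>.

    mu > 0 couples the tasks: mu -> 0 recovers independent per-task SVMs,
    while a large mu approaches a single SVM pooled over all tasks.
    """
    base = X1 @ X2.T
    same_task = (t1[:, None] == t2[None, :]).astype(float)
    return (mu + same_task) * base

# Train one SVM jointly on all tasks via a precomputed Gram matrix.
K = multitask_kernel(X, tasks, X, tasks)
clf = SVC(kernel="precomputed").fit(K, y)
train_acc = clf.score(K, y)
```

Under this construction, predictions for a brand-new task id use only the shared `mu * base` part of the kernel, which is one way the "zero-data" transfer described above can operate: the new task inherits the component learned jointly from the other tasks.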
