Generalized vec trick for fast learning of pairwise kernel models

Pairwise learning corresponds to the supervised learning setting where the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target or protein-protein interactions, or customer-product preferences. Several kernel functions have been proposed for incorporating prior knowledge about the relationship between the objects when training kernel-based learning methods. However, the number of training pairs n is often very large, making the O(n^2) cost of constructing the pairwise kernel matrix infeasible. If each training pair x = (d, t) consists of a drug d and a target t, let m and q denote the numbers of unique drugs and targets appearing in the training pairs. In many real-world applications m, q << n, which can be exploited to develop computational shortcuts. Recently, an O(nm+nq) time algorithm, which we refer to as the generalized vec trick, was introduced for training kernel methods with the Kronecker kernel. In this work, we show that a large class of pairwise kernels can be expressed as a sum of product matrices, which generalizes the result to the most commonly used pairwise kernels. This class includes the symmetric and anti-symmetric, metric-learning, Cartesian, and ranking kernels, as well as the linear, polynomial and Gaussian kernels. In the experiments, we demonstrate how the introduced approach allows scaling pairwise kernels to much larger data sets than previously feasible, and compare the kernels on a number of biological interaction prediction tasks.
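To make the O(nm+nq) claim concrete, the following is a minimal NumPy sketch of the matrix-vector product that such a shortcut relies on for the Kronecker kernel, where the pairwise kernel value for pairs i and j is D[d_i, d_j] * T[t_i, t_j]. It is an illustration under stated assumptions, not the authors' implementation: the function name kron_kernel_matvec and the argument names D, T, d_idx, t_idx and v are hypothetical, D is assumed to be the m x m drug kernel matrix, T the q x q target kernel matrix, and d_idx, t_idx the integer drug and target indices of the n training pairs.

```python
import numpy as np

def kron_kernel_matvec(D, T, d_idx, t_idx, v):
    """Return K @ v, where K[i, j] = D[d_idx[i], d_idx[j]] * T[t_idx[i], t_idx[j]],
    without ever forming the n x n matrix K (hypothetical helper)."""
    m, q = D.shape[0], T.shape[0]
    # Step 1, O(nm): accumulate St[t, d] = sum over pairs j with t_idx[j] == t
    # of v[j] * D[d, d_idx[j]].
    St = np.zeros((q, m))
    np.add.at(St, t_idx, D[:, d_idx].T * v[:, None])
    # Step 2, O(nq): u[i] = sum over t of T[t_idx[i], t] * St[t, d_idx[i]].
    return np.einsum("it,ti->i", T[t_idx], St[:, d_idx])

# Small check against the explicit O(n^2) pairwise kernel matrix.
rng = np.random.default_rng(0)
m, q, n = 5, 4, 30
A = rng.standard_normal((m, 3))
B = rng.standard_normal((q, 3))
D, T = A @ A.T, B @ B.T              # linear kernels on random features
d_idx = rng.integers(0, m, size=n)   # drug index of each training pair
t_idx = rng.integers(0, q, size=n)   # target index of each training pair
v = rng.standard_normal(n)

K = D[np.ix_(d_idx, d_idx)] * T[np.ix_(t_idx, t_idx)]
print(np.allclose(K @ v, kron_kernel_matvec(D, T, d_idx, t_idx, v)))
```

A product of this form can serve as the inner step of an iterative linear solver, so the full pairwise kernel matrix is never constructed. For a kernel written as a sum of such product matrices, the same routine would be called once per term and the results added.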
