A Two-Step Learning Approach for Solving Full and Almost Full Cold Start Problems in Dyadic Prediction

Dyadic prediction methods operate on pairs of objects (dyads), aiming to infer labels for out-of-sample dyads. We consider the full and almost full cold start problems in dyadic prediction: a setting that arises when neither object in an out-of-sample dyad has been observed during training (full cold start), or when one of them has been observed only a few times (almost full cold start). A popular approach to this problem is to train a model that makes predictions based on a pairwise feature representation of the dyads or, in the case of kernel methods, based on a tensor product pairwise kernel. As an alternative to such a kernel approach, we introduce a novel two-step learning algorithm that borrows ideas from the fields of pairwise learning and spectral filtering. We show theoretically that the two-step method is very closely related to the tensor product kernel approach, and experimentally that it yields slightly better predictive performance. Moreover, unlike existing tensor product kernel methods, the two-step method admits closed-form solutions for training and for parameter selection via cross-validation estimates in both the full and almost full cold start settings, making the approach considerably more efficient and more straightforward to implement.
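To make the idea of a closed-form two-step method concrete, the following is a minimal sketch and not the authors' implementation. The abstract gives no formulas, so the sketch assumes that each step is a kernel ridge regression: one over the first objects of the dyads and one over the second objects, each with its own regularization parameter. The function names `two_step_krr_fit` and `two_step_krr_predict`, the kernel matrices `K_u` and `K_v`, and the parameters `lam_u` and `lam_v` are hypothetical names introduced for illustration, and the label matrix `Y` is assumed to be fully observed.

```python
import numpy as np


def two_step_krr_fit(K_u, K_v, Y, lam_u, lam_v):
    """Sketch of a two-step kernel ridge regression fit (assumed form).

    K_u : (n, n) kernel matrix over the first objects of the dyads
    K_v : (m, m) kernel matrix over the second objects of the dyads
    Y   : (n, m) label matrix, assumed fully observed
    Returns A = (K_u + lam_u I)^{-1} Y (K_v + lam_v I)^{-1}.
    """
    n, m = Y.shape
    # Step 1: ridge regression over the first objects (rows of Y).
    left = np.linalg.solve(K_u + lam_u * np.eye(n), Y)
    # Step 2: ridge regression over the second objects (columns of Y).
    # K_v + lam_v I is symmetric, so solving on the transpose applies
    # its inverse from the right.
    A = np.linalg.solve(K_v + lam_v * np.eye(m), left.T).T
    return A


def two_step_krr_predict(A, k_u_new, k_v_new):
    """Predict labels for dyads formed from new objects.

    k_u_new : (p, n) kernel values between new and training first objects
    k_v_new : (q, m) kernel values between new and training second objects
    Returns a (p, q) matrix of predicted labels.
    """
    return k_u_new @ A @ k_v_new.T
```

Under these assumptions both objects of a test dyad may be unseen during training, since a prediction needs only kernel evaluations between the new objects and the training objects. Training costs two linear solves on the small kernel matrices rather than one solve on an (nm) x (nm) pairwise kernel matrix.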
