Paper information - A Two-Step Learning Approach for Solving Full and Almost Full Cold Start Problems in Dyadic Prediction

Dyadic prediction methods operate on pairs of objects (dyads) and aim to infer labels for out-of-sample dyads. We consider the full and almost full cold start problems in dyadic prediction: settings in which neither object of an out-of-sample dyad has been observed during training, or in which one of them has been observed only a few times. A popular approach to this problem is to train a model that makes predictions from a pairwise feature representation of the dyads or, in the case of kernel methods, from a tensor product pairwise kernel. As an alternative to such a kernel approach, we introduce a novel two-step learning algorithm that borrows ideas from pairwise learning and spectral filtering. We show theoretically that the two-step method is very closely related to the tensor product kernel approach, and experimentally that it yields slightly better predictive performance. Moreover, unlike existing tensor product kernel methods, the two-step method admits closed-form solutions for training and for parameter selection via cross-validation estimates in both the full and almost full cold start settings, which makes the approach much more efficient and more straightforward to implement.
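
The two-step idea described in the abstract can be illustrated with a minimal sketch. It assumes that kernel ridge regression is used in both steps and that a label is available for every training dyad; neither assumption is confirmed by the abstract, so this is not the authors' implementation. The function names (`rbf_kernel`, `two_step_predict`), the Gaussian kernel, the random toy data, and the regularization parameters `lam_u` and `lam_v` are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)


def two_step_predict(K, G, Y, k_new, g_new, lam_u, lam_v):
    """Predict labels for unseen dyads with two ridge regressions in sequence.

    K:     (n, n) kernel among training objects of the first type
    G:     (m, m) kernel among training objects of the second type
    Y:     (n, m) labels of the training dyads (assumed complete)
    k_new: (p, n) kernel between new and training first-type objects
    g_new: (q, m) kernel between new and training second-type objects
    """
    n, m = Y.shape
    # Step 1: ridge regression over first-type objects, one output per column
    # of Y. This gives a predicted label row for each new first-type object.
    step1 = k_new @ np.linalg.solve(K + lam_u * np.eye(n), Y)  # (p, m)
    # Step 2: ridge regression over second-type objects, trained on the
    # step-1 predictions, evaluated at the new second-type objects.
    coef = np.linalg.solve(G + lam_v * np.eye(m), step1.T)  # (m, p)
    return coef.T @ g_new.T  # (p, q)


# Toy data (assumption: random features stand in for real object descriptors).
rng = np.random.default_rng(0)
U = rng.normal(size=(6, 3))      # 6 training objects of the first type
V = rng.normal(size=(5, 2))      # 5 training objects of the second type
Y = rng.normal(size=(6, 5))      # labels for all 30 training dyads
U_new = rng.normal(size=(2, 3))  # 2 unseen first-type objects
V_new = rng.normal(size=(4, 2))  # 4 unseen second-type objects

pred = two_step_predict(
    rbf_kernel(U, U), rbf_kernel(V, V), Y,
    rbf_kernel(U_new, U), rbf_kernel(V_new, V),
    lam_u=0.1, lam_v=0.1,
)
print(pred.shape)
```

Under these assumptions, every entry of `pred` is a prediction for a dyad whose two objects were both absent from training, which corresponds to the full cold start setting. Both steps reduce to linear solves, so the sketch needs no iterative optimization; each step also has its own regularization parameter, which is one way the two kernels could be tuned separately.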
