Group-sparse Embeddings in Collective Matrix Factorization

Abstract: Collective matrix factorization (CMF) is a technique for simultaneously learning low-rank representations from a collection of matrices with shared entities. A typical example is the joint modeling of user-item, item-property, and user-feature matrices in a recommender system. The key idea in CMF is that the embeddings are shared across the matrices, which enables transferring information between them. Existing solutions, however, break down when individual matrices have low-rank structure that is not shared with the others. In this work we present a novel CMF solution that allows each matrix to have a separate low-rank structure that is independent of the other matrices, as well as structures that are shared by only a subset of them. We compare maximum a posteriori (MAP) and variational Bayesian solutions based on alternating optimization algorithms and show that the model automatically infers the nature of each factor using group-wise sparsity. Our approach supports continuous, binary, and count observations in a principled way and is efficient for sparse matrices with missing data. We illustrate the solution on a number of examples, focusing in particular on the use case of augmented multi-view learning.
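
To make the shared-embedding idea concrete, the following is a minimal sketch of plain CMF, not the group-sparse Bayesian model described in the abstract. It assumes a squared-error loss with ridge regularization, fitted by alternating least squares; missing entries are encoded as NaN and skipped. The function name `cmf_als`, its parameters, and the `(row_set, col_set, X)` relation format are hypothetical choices made for illustration, and each relation is assumed to link two different entity sets.

```python
import numpy as np

def cmf_als(relations, sizes, rank=2, reg=0.1, sweeps=50, seed=0):
    """Plain CMF sketch: one embedding matrix per entity set, shared by
    every relation that touches that set.

    relations: list of (row_set, col_set, X); X has shape
               (sizes[row_set], sizes[col_set]) and NaN marks missing entries.
    sizes:     number of entities in each entity set.
    """
    rng = np.random.default_rng(seed)
    U = [rng.standard_normal((n, rank)) for n in sizes]
    for _ in range(sweeps):
        for e in range(len(sizes)):
            # Gather every matrix involving entity set e, oriented so that
            # its rows index e, paired with the embeddings of the other side.
            views = []
            for r, c, X in relations:
                if r == e:
                    views.append((X, U[c]))
                if c == e:
                    views.append((X.T, U[r]))
            # Ridge-regression update for each entity, pooling all views.
            for i in range(sizes[e]):
                A = reg * np.eye(rank)
                b = np.zeros(rank)
                for X, V in views:
                    obs = ~np.isnan(X[i])
                    A += V[obs].T @ V[obs]
                    b += V[obs].T @ X[i, obs]
                U[e][i] = np.linalg.solve(A, b)
    return U

# Toy example: entity sets 0 = users, 1 = items, 2 = item properties.
rng = np.random.default_rng(1)
true = [rng.standard_normal((n, 2)) for n in (20, 15, 10)]
X_user_item = true[0] @ true[1].T
X_item_prop = true[1] @ true[2].T
X_user_item[rng.random(X_user_item.shape) < 0.2] = np.nan  # hide 20% of entries

U = cmf_als([(0, 1, X_user_item), (1, 2, X_item_prop)], sizes=[20, 15, 10])
prediction = U[0] @ U[1].T  # includes estimates for the hidden entries
```

Because the item embeddings `U[1]` appear in both relations, the item-property matrix influences the predictions for the user-item matrix. In this sketch every factor is forced to be shared, which is the restriction the abstract argues against: structure private to one matrix has no dedicated factors here.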
