Sampling for Approximate Maximum Search in Factorized Tensor

Factorization models have been used extensively for recovering the missing entries of a matrix or tensor. However, computing every entry directly from the learned factorization model is prohibitively expensive when the matrix or tensor is large. In many applications, such as collaborative filtering, only the few largest entries are of interest. In this work, we propose a sampling-based approach for finding the top entries of a tensor that is decomposed by the CANDECOMP/PARAFAC model. We develop an algorithm that samples the entries with probabilities proportional to their values. We further extend it to sample proportionally to the k-th power of the values, which concentrates the samples on the largest entries. We provide a theoretical analysis of the sampling algorithm and evaluate its performance on several real-world data sets. Experimental results indicate that the proposed approach is orders of magnitude faster than exhaustive computation. When applied to the special case of searching in a matrix, it also requires fewer samples than the other state-of-the-art method.
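The core idea of sampling entries with probability proportional to their values can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes all factor entries are nonnegative, covers only the first-power case (not the k-th power extension), and all function names and parameters (`cp_entry`, `sample_indices`, `approx_top`, `budget`) are hypothetical. For a CANDECOMP/PARAFAC model with factor matrices whose columns are indexed by the component r, the sketch first draws a component r with probability proportional to the product of its column sums, then draws one index per mode with probability proportional to that column's entries. Under the nonnegativity assumption, the resulting index tuple is drawn with probability proportional to its entry value. Frequently sampled tuples are then scored exactly.

```python
import random
from collections import Counter

def cp_entry(factors, index):
    """Exact entry value: sum over components r of the product of factor entries."""
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        prod = 1.0
        for mode, i in enumerate(index):
            prod *= factors[mode][i][r]
        total += prod
    return total

def sample_indices(factors, num_samples, rng):
    """Draw index tuples with probability proportional to their entry values.

    Assumption (not from the paper): every factor entry is nonnegative,
    and at least one component has a positive weight.
    """
    rank = len(factors[0][0])
    # Column sums of each factor matrix, one list per mode.
    col_sums = [[sum(row[r] for row in F) for r in range(rank)] for F in factors]
    # Weight of component r is the product of its column sums over all modes.
    comp_weights = []
    for r in range(rank):
        w = 1.0
        for sums in col_sums:
            w *= sums[r]
        comp_weights.append(w)
    counts = Counter()
    for r in rng.choices(range(rank), weights=comp_weights, k=num_samples):
        # Within component r, pick each mode's index proportional to column r.
        idx = tuple(
            rng.choices(range(len(F)), weights=[row[r] for row in F])[0]
            for F in factors
        )
        counts[idx] += 1
    return counts

def approx_top(factors, num_samples, budget, top_t, seed=0):
    """Score the `budget` most frequently sampled tuples exactly; return the top_t."""
    rng = random.Random(seed)
    counts = sample_indices(factors, num_samples, rng)
    candidates = [idx for idx, _ in counts.most_common(budget)]
    scored = sorted(((cp_entry(factors, idx), idx) for idx in candidates), reverse=True)
    return scored[:top_t]

# Toy rank-2, three-mode example in which entry (0, 0, 0) dominates.
F = [[10.0, 0.1], [0.1, 1.0], [0.1, 0.1]]
factors = [F, F, F]
top = approx_top(factors, num_samples=200, budget=5, top_t=1)
```

Only the `budget` candidate tuples are evaluated exactly, so the cost depends on the number of samples and candidates rather than on the total number of tensor entries.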
