A Greedy Approach for Budgeted Maximum Inner Product Search

Maximum Inner Product Search (MIPS) is an important task in many machine learning applications, such as the prediction phase of a low-rank matrix factorization model in a recommender system. Several recent works have studied how to perform MIPS in sub-linear time. However, most of them lack the flexibility to control the trade-off between search efficiency and search quality. In this paper, we study the MIPS problem under a computational budget. By carefully studying the problem structure of MIPS, we develop a novel Greedy-MIPS algorithm, which handles budgeted MIPS by design. While simple and intuitive, Greedy-MIPS yields surprisingly superior performance compared to state-of-the-art approaches. As a specific example, on a candidate set containing half a million vectors of dimension 200, Greedy-MIPS runs 200x faster than the naive approach while yielding search results with top-5 precision greater than 75%.
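To make the problem concrete, below is a minimal sketch of exact (naive) MIPS alongside a budgeted variant that fully scores only a fixed number of candidates. The array names, the `budget` parameter, and the single-coordinate screening heuristic are illustrative assumptions for this sketch; they are not the Greedy-MIPS procedure described in the paper.

```python
import numpy as np

def exact_mips(H, w, k=5):
    """Naive MIPS: score every candidate row of H against query w, O(n*d)."""
    scores = H @ w                                 # inner product with every candidate
    topk = np.argpartition(-scores, k)[:k]         # unordered top-k indices
    return topk[np.argsort(-scores[topk])]         # sorted by inner product, descending

def budgeted_mips(H, w, k=5, budget=1000):
    """Budgeted MIPS sketch: only `budget` candidates receive an exact score.
    Candidates are pre-ranked by their contribution w[t] * H[:, t] on the
    query's largest-magnitude coordinate -- an illustrative greedy-style
    screening heuristic, not the paper's Greedy-MIPS screening."""
    t = np.argmax(np.abs(w))                       # dominant query coordinate
    order = np.argsort(-(w[t] * H[:, t]))          # rank candidates by that single term
    cand = order[:budget]                          # keep only `budget` survivors
    scores = H[cand] @ w                           # exact inner products on survivors
    topk = cand[np.argpartition(-scores, k)[:k]]
    return topk[np.argsort(-(H[topk] @ w))]

# Usage on synthetic data matching the scale quoted in the abstract
rng = np.random.default_rng(0)
H = rng.standard_normal((500_000, 200))            # half a million candidates, d = 200
w = rng.standard_normal(200)
print(budgeted_mips(H, w, k=5, budget=2000))
```

The `budget` argument is what exposes the efficiency/quality trade-off the abstract refers to: a larger budget scores more candidates exactly (higher precision, slower), while a smaller budget approaches constant-time screening at the cost of recall.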
