Learning Probabilistic Submodular Diversity Models Via Noise Contrastive Estimation

Modeling the diversity of sets of items is important in many applications, such as product recommendation and data summarization. Probabilistic submodular models, a family of models that includes the determinantal point process, form a natural class of distributions that encourage effects such as diversity, repulsion, and coverage. Current models, however, are limited to small and medium numbers of items because of the high time complexity of learning and inference. In this paper, we propose FLID, a novel log-submodular diversity model that scales to large numbers of items and can be learned efficiently using noise contrastive estimation. We show that our model achieves state-of-the-art performance in terms of model fit and can also be learned orders of magnitude faster. We demonstrate the wide applicability of our model using several experiments.
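
To make the two ingredients named above concrete, the following is a minimal sketch of noise contrastive estimation applied to an unnormalized log-submodular set model. The abstract does not give the model's functional form, so the facility-location-style score used here (modular utilities `u` plus a per-dimension "max minus sum" diversity term over a nonnegative matrix `W`) is an assumption for illustration, as are the function names, the free parameter `log_z`, and the uniform noise distribution in the usage lines.

```python
import numpy as np


def flid_style_score(S, u, W):
    """Unnormalized log-score of a set S (a list of item indices).

    Assumed illustrative form (not taken from the abstract):
    sum_{i in S} u[i] + sum_d (max_{i in S} W[i, d] - sum_{i in S} W[i, d]).
    With nonnegative W this is submodular and penalizes redundant items.
    """
    S = list(S)
    if not S:
        return 0.0
    rows = W[S]  # one row of latent features per selected item
    diversity = np.sum(rows.max(axis=0) - rows.sum(axis=0))
    return float(u[S].sum() + diversity)


def nce_loss(data_sets, noise_sets, u, W, log_z, log_noise_prob):
    """Negative NCE objective: logistic classification of data vs. noise.

    log_z is treated as a free parameter standing in for the log-partition
    function; log_noise_prob(S) returns the log-probability of S under the
    (assumed) noise distribution.
    """
    nu = len(noise_sets) / len(data_sets)  # noise-to-data ratio

    def logit(S):
        # log p_model(S) - log(nu * p_noise(S))
        return flid_style_score(S, u, W) - log_z - np.log(nu) - log_noise_prob(S)

    # -log sigmoid(G) on data, -log(1 - sigmoid(G)) on noise
    data_term = sum(np.logaddexp(0.0, -logit(S)) for S in data_sets)
    noise_term = sum(np.logaddexp(0.0, logit(S)) for S in noise_sets)
    return float((data_term + noise_term) / len(data_sets))


# Toy usage: three items, two latent dimensions; items 0 and 1 are redundant.
u = np.array([1.0, 2.0, 0.5])
W = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
uniform_log_prob = lambda S: -3 * np.log(2.0)  # uniform over the 2^3 subsets

redundant = flid_style_score([0, 1], u, W)
diverse = flid_style_score([0, 2], u, W)
loss = nce_loss([[0, 2], [1, 2]], [[0, 1], [0], [1], []], u, W, 0.0, uniform_log_prob)
```

In this toy setup the redundant pair `[0, 1]` loses one unit of score relative to the sum of its utilities, while the diverse pair `[0, 2]` loses nothing. In practice the loss would be minimized over `u`, `W`, and `log_z` with a stochastic gradient method; that loop is omitted here.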
