Exponentiated Strongly Rayleigh Distributions

Strongly Rayleigh (SR) measures are discrete probability distributions over the subsets of a ground set. They enjoy strong negative dependence properties, and as a result they assign higher probability to subsets of diverse elements. In this paper we introduce Exponentiated Strongly Rayleigh (ESR) measures, which sharpen (or smooth) the negative dependence property of SR measures via a single parameter (the exponent) that can be intuitively understood as an inverse temperature. We develop efficient MCMC procedures for approximate sampling from ESRs, and obtain explicit mixing time bounds for two concrete instances: exponentiated versions of Determinantal Point Processes and Dual Volume Sampling. We illustrate some of the potential of ESRs by applying them to a few machine learning tasks; empirical results confirm that, beyond their theoretical appeal, ESR-based models hold significant promise for these tasks.
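To make the role of the exponent concrete, the following is a minimal sketch, assuming that an ESR measure is obtained by raising an SR measure to a power p and renormalizing, so that for a Determinantal Point Process with kernel L the probability of a subset S is proportional to det(L_S)^p. The function names, the toy kernel, and the single-element add/delete Metropolis proposal are illustrative assumptions, not the procedures or bounds developed in the paper.

```python
import itertools
import numpy as np


def subset_weight(L, idx, p):
    """Unnormalized weight det(L_S)**p (assumed ESR form for a DPP kernel L)."""
    if len(idx) == 0:
        return 1.0  # determinant of the empty principal submatrix is 1
    det = np.linalg.det(L[np.ix_(idx, idx)])
    return max(det, 0.0) ** p  # clip tiny negative values from round-off


def esr_dpp_distribution(L, p):
    """Exact distribution over all subsets by brute force (small ground sets only)."""
    n = L.shape[0]
    subsets = [s for k in range(n + 1)
               for s in itertools.combinations(range(n), k)]
    weights = np.array([subset_weight(L, list(s), p) for s in subsets])
    return subsets, weights / weights.sum()


def metropolis_esr_dpp(L, p, n_steps, rng):
    """Illustrative add/delete Metropolis chain targeting det(L_S)**p."""
    n = L.shape[0]
    in_set = np.zeros(n, dtype=bool)  # start from the empty set (weight 1)
    w = subset_weight(L, np.flatnonzero(in_set), p)
    for _ in range(n_steps):
        i = rng.integers(n)           # pick one element uniformly at random
        proposal = in_set.copy()
        proposal[i] = not proposal[i]  # toggle its membership
        w_new = subset_weight(L, np.flatnonzero(proposal), p)
        # Symmetric proposal, so the acceptance ratio is the weight ratio.
        if rng.random() < min(1.0, w_new / w):
            in_set, w = proposal, w_new
    return np.flatnonzero(in_set)


# Toy kernel: items 0 and 1 are highly similar, item 2 is unrelated.
L = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

for p in (0.5, 1.0, 2.0):
    subsets, probs = esr_dpp_distribution(L, p)
    print(p, probs[subsets.index((0, 1))])

sample = metropolis_esr_dpp(L, p=2.0, n_steps=1000, rng=np.random.default_rng(0))
```

In this toy kernel, the probability of the redundant pair {0, 1} shrinks as p grows, which matches the inverse-temperature reading: larger exponents sharpen the preference for diverse subsets, while exponents below 1 smooth it toward uniform.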
