Scaled subordinators and generalizations of the Indian buffet process

We study random families of subsets of $\mathbb{N}$ that are similar to exchangeable random partitions, but do not require constituent sets to be disjoint: each element of $\mathbb{N}$ may be contained in multiple subsets. One class of such objects, known as Indian buffet processes, has become a popular tool in machine learning. Based on an equivalence between Indian buffet processes and scale-invariant Poisson processes, we identify a random scaling variable whose role is similar to that played in exchangeable partition models by the total mass of a random measure. Analogous to the construction of exchangeable partitions from normalized subordinators, random families of sets can be constructed from randomly scaled subordinators. Coupling to a heavy-tailed scaling variable induces a power law on the number of sets containing the first $n$ elements. Several examples, with properties desirable in applications, are derived explicitly. A relationship to exchangeable partitions is made precise as a correspondence between scaled subordinators and Poisson-Kingman measures, generalizing a result of Arratia, Barbour and Tavaré on scale-invariant processes.
