Cluster and Feature Modeling from Combinatorial Stochastic Processes

One of the focal points of the modern literature on Bayesian nonparametrics has been the problem of clustering, or partitioning, where each data point is modeled as being associated with one and only one of some collection of groups called clusters or partition blocks. Underlying these Bayesian nonparametric models is a set of interrelated stochastic processes, most notably the Dirichlet process and the Chinese restaurant process. In this paper we provide a formal development of an analogous problem, called feature modeling, in which each data point is associated with an arbitrary nonnegative integer number of groups, now called features or topics. We review the existing combinatorial stochastic process representations for the clustering problem and develop analogous representations for the feature modeling problem. These representations include the beta process and the Indian buffet process, as well as new representations that provide insight into the connections between these processes. We thereby bring the same level of completeness to the treatment of Bayesian nonparametric feature modeling that has previously been achieved for Bayesian nonparametric clustering.
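To make the contrast between clustering and feature modeling concrete, the following is a minimal sketch of the standard one-parameter Chinese restaurant process and Indian buffet process in their sequential (restaurant) forms. It is an illustration only, not code from the paper: the function names `sample_crp`, `sample_ibp`, and `poisson_draw` and the parameter name `alpha` are assumptions introduced here.

```python
import math
import random


def sample_crp(n, alpha, rng):
    """Chinese restaurant process: each point gets exactly one cluster label."""
    labels = []
    counts = []  # counts[k] = number of points currently in cluster k
    for _ in range(n):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)  # open a new cluster
        counts[k] += 1
        labels.append(k)
    return labels


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def sample_ibp(n, alpha, rng):
    """Indian buffet process: each point gets a (possibly empty) set of features."""
    counts = []  # counts[k] = number of points that have feature k
    allocations = []
    for i in range(1, n + 1):
        # Point i takes existing feature k with probability counts[k] / i.
        features = {k for k, m in enumerate(counts) if rng.random() < m / i}
        # It then introduces Poisson(alpha / i) brand-new features.
        for _ in range(poisson_draw(alpha / i, rng)):
            counts.append(0)
            features.add(len(counts) - 1)
        for k in features:
            counts[k] += 1
        allocations.append(features)
    return allocations
```

Calling `sample_crp(10, 1.0, random.Random(0))` returns one integer label per data point, whereas `sample_ibp(10, 2.0, random.Random(0))` returns one set of feature indices per data point. A set may be empty or contain several indices, which is the sense in which a point belongs to an arbitrary nonnegative integer number of groups.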
