Combinatorial Clustering and the Beta Negative Binomial Process

We develop a Bayesian nonparametric approach to a general family of latent class problems in which individuals can belong simultaneously to multiple classes and where each class can be exhibited multiple times by an individual. We introduce a combinatorial stochastic process known as the negative binomial process (NBP) as an infinite-dimensional prior appropriate for such problems. We show that the NBP is conjugate to the beta process, and we characterize the posterior distribution under the beta-negative binomial process (BNBP) and hierarchical models based on the BNBP (the HBNBP). We study the asymptotic properties of the BNBP and develop a three-parameter extension of the BNBP that exhibits power-law behavior. We derive MCMC algorithms for posterior inference under the HBNBP, and we present experiments using these algorithms in the domains of image segmentation, object recognition, and document analysis.
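
The conjugacy claim can be illustrated with a finite-dimensional sketch. This is not the paper's construction or its MCMC algorithm: it truncates to a fixed number of classes and gives each class weight a plain Beta(a, b) prior, and the function names and the parameters a, b, and r are hypothetical choices made for illustration. Under those assumptions, each individual's count for a class is negative binomial with probability mass proportional to weight^count * (1 - weight)^r, so the posterior of each weight given the counts is again a beta distribution.

```python
import numpy as np


def sample_finite_bnbp(rng, n_individuals, n_classes, a, b, r):
    """Finite sketch (an assumption, not the paper's model):
    class weights ~ Beta(a, b), per-individual counts ~ NB(r, weight)."""
    weights = rng.beta(a, b, size=n_classes)
    # NumPy's negative_binomial(n, p) counts failures before n successes
    # with success probability p, so p = 1 - weight gives
    # P(count = i) proportional to weight**i * (1 - weight)**r.
    counts = rng.negative_binomial(
        r, 1.0 - weights, size=(n_individuals, n_classes)
    )
    return weights, counts


def posterior_beta_params(counts, a, b, r):
    """Beta posterior parameters for each class weight given the counts.

    The likelihood contributes weight**sum(counts) * (1 - weight)**(n * r),
    which multiplies the Beta(a, b) density into another beta density.
    """
    n_individuals = counts.shape[0]
    a_post = a + counts.sum(axis=0)  # one entry per class
    b_post = b + n_individuals * r   # shared by all classes
    return a_post, b_post


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights, counts = sample_finite_bnbp(
        rng, n_individuals=200, n_classes=5, a=0.5, b=3.0, r=2.0
    )
    a_post, b_post = posterior_beta_params(counts, a=0.5, b=3.0, r=2.0)
    # Posterior means should sit close to the sampled weights.
    print("true weights   :", np.round(weights, 3))
    print("posterior means:", np.round(a_post / (a_post + b_post), 3))
```

Each count matrix row is one individual and each column one class, so an individual can exhibit several classes at once and any class more than once, which is the setting the abstract describes.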
