The nested Chinese restaurant process and hierarchical topic models

We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP lead to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics, and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning: the use of Bayesian nonparametric methods to infer distributions on flexible data structures.
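To make the generative mechanism concrete, the sketch below shows how a document's path might be drawn from the nCRP: at each level of the tree, the document picks a child of its current node via an ordinary Chinese restaurant process, choosing an existing child in proportion to how many previous documents passed through it, or a new child with probability proportional to a concentration parameter gamma. This is a minimal illustration under assumed names (crp_choice, ncrp_sample_path, gamma), not the authors' implementation; a full hLDA sampler would additionally assign topics to nodes and words to levels.

```python
import random
from collections import defaultdict

def crp_choice(counts, gamma):
    """Chinese restaurant process step: pick an existing child with
    probability proportional to its count, or a new child (a fresh
    integer key) with probability proportional to gamma."""
    total = sum(counts.values()) + gamma
    r = random.uniform(0, total)
    for child, count in counts.items():
        r -= count
        if r <= 0:
            return child
    return max(counts, default=-1) + 1  # open a new child node

def ncrp_sample_path(tree_counts, depth, gamma):
    """Sample one root-to-leaf path of the given depth from the nCRP.

    tree_counts maps a node (a tuple encoding its path from the root)
    to the visit counts of its children. Counts are updated in place,
    so repeated calls exhibit the preferential attachment dynamics
    that cluster documents onto popular subtrees.
    """
    path = ()
    for _ in range(depth):
        child = crp_choice(tree_counts[path], gamma)
        tree_counts[path][child] += 1
        path = path + (child,)
    return path

# Example: 20 "documents" choosing paths in a depth-3 tree.
tree = defaultdict(lambda: defaultdict(int))
paths = [ncrp_sample_path(tree, depth=3, gamma=1.0) for _ in range(20)]
```

Running this repeatedly shows the characteristic rich-get-richer behavior: early paths attract later documents, yielding a few heavily shared subtrees alongside occasional novel branches, with gamma controlling the rate at which new branches appear.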
