Bayesian nonparametric learning for complicated text mining

Text mining has attracted growing attention from researchers in recent years because text is one of the most natural and convenient ways to express human knowledge and opinions; it is therefore believed to have a wide range of application scenarios and potentially high commercial value. Bayesian models with finite-dimensional probability distributions as building blocks, also known as parametric topic models, are commonly accepted as effective tools for text mining. However, existing parametric topic models require the number of hidden topics to be fixed in advance. Determining an appropriate number is very difficult, and sometimes unrealistic, in many real-world applications, and a poor choice may lead to over-fitting or under-fitting. Bayesian nonparametric learning is a key approach for learning the number of components in a mixture model (also called the model selection problem), and has emerged as an elegant way to handle a flexible number of topics. The core idea of Bayesian nonparametric models is to use stochastic processes as building blocks instead of traditional fixed-dimensional probability distributions. Although Bayesian nonparametric learning has gained considerable research attention and undergone rapid development, its ability to handle complicated text mining tasks, such as document-word co-clustering, document network learning, and multi-label document learning, is still weak. Therefore, there is still a gap between Bayesian nonparametric learning theory and complicated real-world
