Topic-Based Generative Models for Text Information Access

[1]  David Kauchak,et al.  Modeling word burstiness using the Dirichlet distribution , 2005, ICML.

[2]  Ata Kabán,et al.  On an equivalence between PLSI and LDA , 2003, SIGIR.

[3]  Inderjit S. Dhillon,et al.  Clustering with Bregman Divergences , 2005, J. Mach. Learn. Res.

[4]  Mark A. Girolami,et al.  A Probabilistic Framework for the Hierarchic Organisation and Classification of Document Collections , 2004, Journal of Intelligent Information Systems.

[5]  Yee Whye Teh,et al.  A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation , 2006, NIPS.

[6]  Michael I. Jordan,et al.  Hierarchical Dirichlet Processes , 2006 .

[7]  Xin Jin,et al.  Web usage mining based on probabilistic latent semantic analysis , 2004, KDD.

[8]  David Cohn,et al.  Learning to Probabilistically Identify Authoritative Documents , 2000, ICML.

[9]  Donna K. Harman,et al.  Overview of the Fourth Text REtrieval Conference (TREC-4) , 1995, TREC.

[10]  Michael I. Jordan,et al.  A latent variable model for chemogenomic profiling , 2005, Bioinform.

[11]  Max Welling,et al.  Distributed Algorithms for Topic Models , 2009, J. Mach. Learn. Res.

[12]  Slava M. Katz.  Distribution of content words and phrases in text and language modelling , 1996, Natural Language Engineering.

[13]  Jianhua Lin,et al.  Divergence measures based on the Shannon entropy , 1991, IEEE Trans. Inf. Theory.

[14]  Pietro Perona,et al.  Memory bounded inference in topic models , 2008, ICML '08.

[15]  Thomas L. Griffiths,et al.  The Author-Topic Model for Authors and Documents , 2004, UAI.

[16]  Max Welling,et al.  Deterministic Latent Variable Models and Their Pitfalls , 2008, SDM.

[17]  ChengXiang Zhai,et al.  A study of smoothing methods for language models applied to information retrieval , 2004, TOIS.

[18]  Wray L. Buntine.  Estimating Likelihoods for Topic Models , 2009, ACML.

[19]  L. J. Savage,et al.  Symmetric measures on Cartesian products , 1955 .

[20]  Charles Elkan,et al.  Accounting for burstiness in topic models , 2009, ICML '09.

[21]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[22]  Shun-ichi Amari,et al.  Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.

[23]  Hal Daumé,et al.  A geometric view of conjugate priors , 2010, Machine Learning.

[24]  Thomas L. Griffiths,et al.  The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies , 2007, JACM.

[25]  John D. Lafferty,et al.  Model-based feedback in the language modeling approach to information retrieval , 2001, CIKM '01.

[26]  Sebastian Thrun,et al.  Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.

[27]  Michael I. Jordan,et al.  Modeling annotated data , 2003, SIGIR.

[28]  M. Meilă.  Comparing clusterings: an information based distance , 2007 .

[29]  Michael I. Jordan,et al.  DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification , 2008, NIPS.

[30]  Thomas L. Griffiths,et al.  Probabilistic author-topic models for information discovery , 2004, KDD.

[31]  Wray L. Buntine.  Variational Extensions to EM and Multinomial PCA , 2002, ECML.

[32]  John D. Lafferty,et al.  A correlated topic model of Science , 2007, arXiv:0708.3601.

[33]  L. Azzopardi,et al.  Topic based language models for ad hoc information retrieval , 2004, 2004 IEEE International Joint Conference on Neural Networks.

[34]  Wei Li,et al.  Pachinko allocation: DAG-structured mixture models of topic correlations , 2006, ICML.

[35]  Andrew McCallum,et al.  A comparison of event models for naive bayes text classification , 1998, AAAI 1998.

[36]  C. Goutte,et al.  Co-Occurrence Models in Music Genre Classification , 2005, 2005 IEEE Workshop on Machine Learning for Signal Processing.

[37]  J. Dickey.  Multiple Hypergeometric Functions: Probabilistic Interpretations and Statistical Uses , 1983 .

[38]  Jean-Cédric Chappelier,et al.  PLSI: The True Fisher Kernel and beyond , 2009, ECML/PKDD.

[39]  Andrew McCallum,et al.  Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email , 2007, J. Artif. Intell. Res.

[40]  Éric Gaussier,et al.  Relation between PLSA and NMF and implications , 2005, SIGIR '05.

[41]  Ramesh Nallapati,et al.  Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora , 2009, EMNLP.

[42]  Daniel Gatica-Perez,et al.  Modeling Semantic Aspects for Cross-Media Image Indexing , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Kris Popat,et al.  A Hierarchical Model for Clustering and Categorising Documents , 2002, ECIR.

[44]  Tom Minka,et al.  Expectation-Propagation for the Generative Aspect Model , 2002, UAI.

[45]  P. Donnelly,et al.  Inference of population structure using multilocus genotype data. , 2000, Genetics.

[46]  Gregor Heinrich,et al.  A Generic Approach to Topic Models , 2009, ECML/PKDD.

[47]  Aleks Jakulin,et al.  Discrete Component Analysis , 2005, SLSFS.

[48]  David R. Karger,et al.  Tackling the Poor Assumptions of Naive Bayes Text Classifiers , 2003, ICML.

[49]  Charles Elkan,et al.  Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution , 2006, ICML.

[50]  ChengXiang Zhai,et al.  A mixture model for contextual text mining , 2006, KDD '06.

[51]  Daniel Gatica-Perez,et al.  PLSA-based image auto-annotation: constraining the latent space , 2004, MULTIMEDIA '04.

[52]  Andrew Zisserman,et al.  Scene Classification Via pLSA , 2006, ECCV.

[53]  Thomas Hofmann,et al.  Unsupervised Learning by Probabilistic Latent Semantic Analysis , 2004, Machine Learning.

[54]  Max Welling,et al.  Asynchronous distributed estimation of topic models for document analysis , 2011, Statistical Methodology.

[55]  Thomas L. Griffiths,et al.  A probabilistic approach to semantic representation , 2002, Proceedings of the Twenty-Fourth Annual Conference of the Cognitive Science Society.

[56]  Christopher Joseph Pal,et al.  Multi-Conditional Learning: Generative/Discriminative Training for Clustering and Classification , 2006, AAAI.

[57]  Ramesh Nallapati,et al.  Joint latent topic models for text and citations , 2008, KDD.

[58]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[59]  Yee Whye Teh,et al.  On Smoothing and Inference for Topic Models , 2009, UAI.

[60]  Charles Elkan,et al.  Deriving TF-IDF as a Fisher Kernel , 2005, SPIRE.

[61]  François Yvon,et al.  Using LDA to detect semantically incoherent documents , 2008, CoNLL.

[62]  Andrew McCallum,et al.  Rethinking LDA: Why Priors Matter , 2009, NIPS.

[63]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[64]  D. K. Harman,et al.  Overview of the Third Text REtrieval Conference (TREC-3) , 1996 .

[65]  John F. Canny,et al.  GaP: a factor model for discrete data , 2004, SIGIR '04.

[66]  David D. Lewis,et al.  Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval , 1998, ECML.

[67]  Éric Gaussier,et al.  The BNB Distribution for Text Modeling , 2008, ECIR.

[68]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2003, J. Mach. Learn. Res.

[69]  Jean-Cédric Chappelier,et al.  Revisiting Fisher Kernels for Document Similarities , 2006, ECML.

[70]  ChengXiang Zhai,et al.  Statistical Language Models for Information Retrieval: A Critical Review , 2008, Found. Trends Inf. Retr.

[71]  Aleks Jakulin,et al.  Applying Discrete PCA in Data Analysis , 2004, UAI.

[72]  David M. Pennock,et al.  Probabilistic Models for Unified Collaborative and Content-Based Recommendation in Sparse-Data Environments , 2001, UAI.

[73]  W. Bruce Croft,et al.  LDA-based document models for ad-hoc retrieval , 2006, SIGIR.

[74]  Thomas L. Griffiths,et al.  Hierarchical Topic Models and the Nested Chinese Restaurant Process , 2003, NIPS.

[75]  Mark Steyvers,et al.  Finding scientific topics , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[76]  Hanna M. Wallach,et al.  Topic modeling: beyond bag-of-words , 2006, ICML.

[77]  Thomas L. Griffiths,et al.  Integrating Topics and Syntax , 2004, NIPS.

[78]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.

[79]  Jean-Cédric Chappelier,et al.  An Ad Hoc Information Retrieval Perspective on PLSI through Language Model Identification , 2009, ICTIR.