Parameter estimation for text analysis

This text presents parameter estimation methods common to discrete probability distributions, which are of particular interest in text modeling. Starting with maximum likelihood, maximum a posteriori, and Bayesian estimation, it reviews central concepts such as conjugate distributions and Bayesian networks. As an application, the model of latent Dirichlet allocation (LDA) is explained in detail, with a full derivation of an approximate inference algorithm based on Gibbs sampling and a discussion of Dirichlet hyperparameter estimation. Finally, methods for analyzing LDA models are discussed.
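As a rough illustration of how the three estimation approaches named above differ, the following minimal sketch compares them on a Bernoulli (coin-flip) model with a conjugate Beta prior. The function name `bernoulli_estimates`, the counts, and the prior pseudo-counts `alpha` and `beta` are hypothetical choices made for this example and are not taken from the text.

```python
def bernoulli_estimates(heads, tails, alpha=2.0, beta=2.0):
    """Return (ML, MAP, posterior-mean) estimates of the success probability.

    Assumes a Beta(alpha, beta) prior, which is conjugate to the Bernoulli
    likelihood, so the posterior is Beta(heads + alpha, tails + beta).
    The MAP formula below is the mode of that posterior and is only valid
    when both posterior parameters exceed 1.
    """
    n = heads + tails
    if n == 0:
        raise ValueError("need at least one observation")

    # Maximum likelihood: relative frequency of successes.
    ml = heads / n

    # Maximum a posteriori: mode of the Beta posterior.
    map_est = (heads + alpha - 1.0) / (n + alpha + beta - 2.0)

    # Bayesian estimate: mean of the Beta posterior.
    post_mean = (heads + alpha) / (n + alpha + beta)

    return ml, map_est, post_mean


# Illustrative data: 12 successes, 8 failures, prior pseudo-counts of 5 each.
ml, map_est, post_mean = bernoulli_estimates(heads=12, tails=8, alpha=5.0, beta=5.0)
print(f"ML={ml:.4f}  MAP={map_est:.4f}  posterior mean={post_mean:.4f}")
```

With these illustrative numbers the prior pulls both the MAP estimate and the posterior mean from the observed frequency toward 0.5. The same pseudo-count effect carries over to the multinomial and Dirichlet case used in text models.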
