Applications of Topic Models

How can a single person understand what’s going on in a collection of millions of documents? This is an increasingly widespread problem: sifting through an organization’s e-mails, understanding a decade’s worth of newspapers, or characterizing a scientific field’s research. This monograph explores the ways that humans and computers make sense of document collections through tools called topic models. Topic models are a statistical framework that helps users understand large document collections: not just to find individual documents, but to understand the general themes present in the collection. Applications of Topic Models describes the recent academic and industrial applications of topic models. In addition to topic models’ effective application to traditional problems like information retrieval, visualization, statistical inference, multilingual modeling, and linguistic understanding, Applications of Topic Models also reviews topic models’ ability to unlock large text collections for qualitative analysis. It reviews their successful use by researchers to help understand fiction, non-fiction, scientific publications, and political texts. Applications of Topic Models is aimed at readers who have some knowledge of document processing and a basic understanding of probability, and who are interested in many application domains. It discusses the information needs of each application area and how those specific needs affect models, curation procedures, and interpretations. By the end of the monograph, the hope is that readers will be excited enough to start building their own topic models. The monograph should also interest topic model experts, as its coverage of diverse applications may expose models and approaches they have not seen before.
