The Hidden Markov Topic Model: A Probabilistic Model of Semantic Representation

In this paper, we describe a model that learns semantic representations from the distributional statistics of language. Unlike the common bag-of-words paradigm, this model infers semantic representations by taking into account the inherent sequential nature of linguistic data. The model, which we refer to as the Hidden Markov Topics model, is a natural extension of the current state of the art in Bayesian bag-of-words models, namely the Topics model of Griffiths, Steyvers, and Tenenbaum (2007). It preserves that model's strengths while extending its scope to incorporate more fine-grained linguistic information.

[1]  James R. Curran, et al. Scaling Context Space, 2002, ACL.

[2]  Zellig S. Harris, et al. Distributional Structure, 1954.

[3]  H. Gleitman, et al. Human simulations of vocabulary learning, 1999, Cognition.

[4]  Michael N. Jones, et al. Representing word meaning and order information in a composite holographic lexicon, 2007, Psychological Review.

[5]  Gabriella Vigliocco, et al. Integrating experiential and distributional data to learn semantic representations, 2009, Psychological Review.

[6]  Susan T. Dumais, et al. The latent semantic analysis theory of knowledge, 1997.

[7]  Mark Steyvers, et al. Topics in semantic representation, 2007, Psychological Review.

[8]  Mirella Lapata, et al. Dependency-Based Construction of Semantic Space Models, 2007, CL.

[9]  Thomas L. Griffiths, et al. Integrating Topics and Syntax, 2004, NIPS.

[10]  James R. Curran, et al. Improvements in Automatic Thesaurus Extraction, 2002, ACL 2002.

[11]  Dominic Widdows, et al. Unsupervised methods for developing taxonomies by combining syntactic and statistical information, 2003, NAACL.

[12]  W. Gilks, et al. Adaptive Rejection Sampling for Gibbs Sampling, 1992.

[13]  Michael I. Jordan, et al. Latent Dirichlet Allocation, 2001, J. Mach. Learn. Res.

[14]  Thomas L. Griffiths, et al. A probabilistic approach to semantic representation, 2002, Proceedings of the Twenty-Fourth Annual Conference of the Cognitive Science Society.

[15]  Hanna M. Wallach, et al. Topic modeling: beyond bag-of-words, 2006, ICML.

[16]  Naftali Tishby, et al. Distributional Clustering of English Words, 1993, ACL.

[17]  David M. Blei, et al. Syntactic Topic Models, 2008, NIPS.

[18]  Thomas L. Griffiths, et al. Prediction and Semantic Association, 2002, NIPS.

[19]  J. R. Firth, et al. A Synopsis of Linguistic Theory, 1930-1955, 1957.

[20]  H. Schütze, et al. Dimensions of meaning, 1992, Supercomputing '92.

[21]  Suzanne Stevenson, et al. A Computational Model of Early Argument Structure Acquisition, 2008, Cogn. Sci.

[22]  Dekang Lin, et al. Automatic Retrieval and Clustering of Similar Words, 1998, ACL.

[23]  T. Landauer, et al. A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge, 1997.