In this paper, we propose to model and analyze the changes an entity undergoes in terms of changes in the words that co-occur with it over time. We analyze in depth how this co-occurrence changes over time, how the change influences the state (semantics, role) of the entity, and how the change may correspond to events occurring in the same period. We identify clusters of topics surrounding the entity over time using Topics-Over-Time (TOT) and k-means clustering. We conduct this analysis on the Google Books Ngram dataset. We show how clustering the words that co-occur with an entity of interest in 5-grams can shed light on the nature of the change the entity undergoes and identify the period in which the change occurs. We find that the period identified by our model coincides precisely with events in the same period that correspond to the change.
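The k-means part of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation, and it omits the TOT step entirely. It assumes 5-gram records arrive as hypothetical `(ngram, year, count)` tuples, uses a toy stopword list and invented sample data, and clusters per-year co-occurrence distributions with a plain k-means written in the standard library. All function names are illustrative.

```python
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "an", "in", "for"}  # assumption: tiny illustrative list


def cooccurrence_by_year(records, entity):
    """Count words that share a 5-gram with `entity`, grouped by year."""
    by_year = defaultdict(Counter)
    for ngram, year, count in records:
        tokens = ngram.lower().split()
        if entity in tokens:
            for tok in tokens:
                if tok != entity and tok not in STOPWORDS:
                    by_year[year][tok] += count
    return by_year


def to_vectors(by_year):
    """Turn per-year counts into relative-frequency vectors over one vocabulary."""
    vocab = sorted({w for counts in by_year.values() for w in counts})
    years = sorted(by_year)
    vectors = []
    for y in years:
        total = sum(by_year[y].values())
        vectors.append([by_year[y][w] / total for w in vocab])
    return years, vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Plain k-means (k >= 2) with deterministic seeds spread along the time axis."""
    n = len(vectors)
    centroids = [list(vectors[i * (n - 1) // (k - 1)]) for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def change_points(years, labels):
    """Years at which the cluster label differs from the previous year's label."""
    return [y for prev, cur, y in zip(labels, labels[1:], years[1:]) if prev != cur]


# Invented toy records, not taken from the Google Books Ngram dataset.
records = [
    ("the apple tree bore fruit", 1970, 10),
    ("a ripe apple fruit fell", 1975, 8),
    ("the apple tree in bloom", 1980, 9),
    ("new apple computer for school", 1985, 12),
    ("apple computer sold many units", 1990, 15),
    ("an apple computer for school", 1995, 11),
]

years, vectors = to_vectors(cooccurrence_by_year(records, "apple"))
labels = kmeans(vectors, k=2)
changes = change_points(years, labels)
print(list(zip(years, labels)), changes)
```

On this toy input the years split into an early cluster and a late cluster, and the first year of the late cluster is reported as the candidate period of change. Real data would need a proper stopword list, a way to choose `k`, and smoothing across sparse years.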