At different times, people use different words to describe the same concepts. Both change and stability in word usage can indicate wider socio-cultural changes. To gain insight into how people perceive a concept, it is valuable to trace how the words denoting that concept change over time. Existing tools for exploring historical concepts, such as keyword searching or topic modeling, are ill-suited for the task; they are either too top-down or too rigid for an iterative exploration of historical concepts in large data sets. In this article, we present ShiCo: a graphical interface for visualising concepts over time by monitoring shifts in word usage in a document corpus. As the dimension of time plays a crucial role in ShiCo, we demonstrate it on a large corpus of newspaper articles spanning several decades. We describe the design choices made during the development of ShiCo and the key parameters that control the tool's behaviour. Lastly, as ShiCo is meant to be used by the broader community, we describe the steps required for running ShiCo on a new data set.