Mining Long-term Topics from a Real-time Feed

In today's society, data have gone from scarce to abundant: huge volumes are generated every second. A significant part of these data is generated on social media platforms, which provide a highly volatile flow of information. Leveraging the information buried in this fast stream of messages poses a serious challenge. In this paper, we aim to distinguish all topics that are discussed in real time in a social media feed by employing clustering and algorithmic techniques. We evaluate our approach by comparing its results to those of a post-hoc clustering approach.