Tracking Topic Evolution in News Environments

For companies operating on a global scale, the need to monitor and analyze news channels and consumer-generated media on the Web, such as weblogs and newsgroups, is steadily increasing. In particular, identifying novel trends and emerging issues, as well as their evolution over time, is of utmost importance to corporate communications and market analysts. Automated machine learning systems based on clustering techniques have only partially succeeded in addressing these requirements, failing to properly assign short-term hype topics to long-term trends. We propose an approach that monitors news wires at different levels of temporal granularity, extracting key phrases that reflect short-term topics as well as longer-term trends by means of statistical language modelling. Moreover, our approach allows time windows of smaller scope to be assigned to those spanning longer intervals.
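
To make the idea concrete, the following is a minimal illustrative sketch and not the method of this paper. It assumes unigram language models with add-alpha smoothing, scores each term by its pointwise contribution to the divergence between a time window and a background collection, and assigns a short window to the long window with the highest Jaccard overlap of key terms. The function names `unigram_counts`, `key_terms`, and `assign_to_trend`, the whitespace tokenization, and the choice of scoring and overlap measures are all assumptions made for illustration.

```python
import math
from collections import Counter


def unigram_counts(docs):
    """Count lowercase whitespace tokens over a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def key_terms(window_docs, background_docs, top_k=3, alpha=1.0):
    """Rank window terms by p_fg * log(p_fg / p_bg) with add-alpha smoothing.

    Hypothetical scoring: terms that are frequent in the window but rare
    in the background receive the highest scores.
    """
    fg = unigram_counts(window_docs)
    bg = unigram_counts(background_docs)
    vocab = set(fg) | set(bg)
    fg_total = sum(fg.values()) + alpha * len(vocab)
    bg_total = sum(bg.values()) + alpha * len(vocab)

    def score(term):
        p_fg = (fg[term] + alpha) / fg_total
        p_bg = (bg[term] + alpha) / bg_total
        return p_fg * math.log(p_fg / p_bg)

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(fg, key=lambda t: (-score(t), t))
    return ranked[:top_k]


def assign_to_trend(short_terms, long_windows):
    """Return the name of the long window with the highest Jaccard overlap.

    long_windows maps a window name to its list of key terms (assumed input).
    """
    def jaccard(a, b):
        a, b = set(a), set(b)
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    return max(long_windows, key=lambda name: jaccard(short_terms, long_windows[name]))
```

In such a sketch, a daily window would be passed to `key_terms` against a background of older news, and the resulting terms would be matched against the key terms of weekly or monthly windows with `assign_to_trend`.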