Empowering Cross-Domain Internet Media with Real-Time Topic Learning from Social Streams

This paper connects social media from disparate sources on the Internet by building a common topic space between them, through which cross-domain media recommendations can be made on the web. The topic space is built and updated in real time by extending the Latent Dirichlet Allocation (LDA) model to handle streaming online data. Our topic model, named Online Streaming LDA (OSLDA), extracts, learns, populates, and updates the topic space in real time, scaling with the stream of incoming tweets. Based on this topic space, we present two media recommendation applications that conventional media analysis techniques cannot support: (1) tweet enrichment, which recommends videos related to a tweet, and (2) popular video recommendation, which features videos on socially trending topics. We conduct experiments on a collection of 3.6 million tweets and 1.2 million click-through records from a video search engine. Our results show that the learned topic model serves as a natural link between cross-domain social media, leading to a better experience for users consuming social media.
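To make the idea of a shared topic space concrete, the following is a minimal sketch of tweet enrichment, not the OSLDA method itself. It assumes that a tweet and each candidate video have already been mapped to topic-proportion vectors over the same topics, and it ranks videos by cosine similarity to the tweet. The similarity measure, the function names (`cosine`, `recommend_videos`), and the toy vectors are illustrative assumptions and do not come from the paper.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def recommend_videos(tweet_topics, video_topics, k=2):
    """Return the ids of the k videos closest to the tweet in topic space.

    tweet_topics: list of topic proportions for one tweet (hypothetical input).
    video_topics: dict mapping video id -> topic proportions (hypothetical input).
    """
    ranked = sorted(
        video_topics,
        key=lambda vid: cosine(tweet_topics, video_topics[vid]),
        reverse=True,
    )
    return ranked[:k]


# Toy data over three topics; the values are made up for illustration only.
tweet = [0.7, 0.2, 0.1]
videos = {
    "video_a": [0.8, 0.1, 0.1],
    "video_b": [0.1, 0.1, 0.8],
    "video_c": [0.3, 0.6, 0.1],
}

top = recommend_videos(tweet, videos, k=2)
print(top)
```

Because both media types are compared only through their topic vectors, no tweet text needs to be matched directly against video metadata in this sketch; any other similarity or divergence measure could be substituted for the cosine.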