News2meme: An Automatic Content Generator from News Based on Word Subspaces from Text and Image

Internet users engage in content creation using various media formats. One of the most popular forms is the “internet meme”, which often conveys a general opinion about an event through an image and a catchphrase. In this paper, we propose news2meme, a method for automatically generating memes from a news article by efficiently matching words and images. We approach this as two multimedia retrieval problems with the same input news text: 1) an image retrieval task, where the output is a meme image; and 2) a text retrieval task, where the output is a catchphrase. These two outputs are combined to generate the meme for the news article. We represent texts and catchphrases as sets of word vectors obtained with word2vec. To handle images in the same way, we extract a set of tags from each image using a deep neural network and map these tags to word vectors in the same vector space through word2vec. Finally, we represent the intrinsic variability of the features in each set of word vectors with a word subspace. By comparing word subspaces, we can directly compare images and texts, making retrieval across media formats possible. A preliminary experiment was performed to evaluate our framework.
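To make the word subspace idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a word subspace is spanned by the leading left singular vectors of the matrix of word vectors (uncentered PCA), and that two subspaces are compared through the cosines of their canonical angles; the abstract does not specify either choice. The function names, the subspace dimension, and the random vectors standing in for word2vec embeddings are all hypothetical.

```python
import numpy as np


def word_subspace(vectors, dim):
    """Orthonormal basis (d x dim) for a set of word vectors (n x d).

    Assumption: the subspace is spanned by the leading left singular
    vectors of the d x n data matrix (uncentered PCA).
    """
    X = np.asarray(vectors, dtype=float).T  # shape: d x n
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim]


def subspace_similarity(U1, U2):
    """Mean squared cosine of the canonical angles between two subspaces.

    The singular values of U1^T U2 are the cosines of the canonical angles
    when both bases are orthonormal.
    """
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against rounding error
    return float(np.mean(cosines ** 2))


def retrieve(query_basis, candidate_bases):
    """Return the index of the most similar candidate and all scores."""
    scores = [subspace_similarity(query_basis, B) for B in candidate_bases]
    return int(np.argmax(scores)), scores


# Hypothetical stand-ins for word2vec vectors (50-dimensional here).
rng = np.random.default_rng(0)
news_vectors = rng.normal(size=(20, 50))         # words of a news article
image_tag_vectors = [rng.normal(size=(8, 50))    # tags of candidate image 0
                     for _ in range(3)]          # ... and images 1, 2

news_basis = word_subspace(news_vectors, dim=5)
image_bases = [word_subspace(v, dim=5) for v in image_tag_vectors]
best_image, image_scores = retrieve(news_basis, image_bases)
```

Under these assumptions, the same `retrieve` call would serve both tasks: once with subspaces built from image tags to pick the meme image, and once with subspaces built from catchphrase words to pick the catchphrase.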