A Cross-Modal Short Text Semantic Expansion Method for Microblog Search

Images are an important part of microblogs, and their visual information offers semantics beyond what the text conveys. To overcome the semantic sparsity of short texts and make full use of both textual and visual semantics, we propose a cross-modal short text expansion method for microblog search. First, we expand each short text using distributed word representations. Then, using a deep neural network, we extract related information from the images and append it to the original short text. The resulting expanded pseudo-documents contain richer semantics, and converting them into vectors enables accurate microblog search. Experiments on real-world datasets show that the proposed method effectively captures the semantics of microblogs and improves search performance.
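
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy embedding table, the function names (`expand_text`, `build_pseudo_document`, `doc_vector`, `search`), the averaging of word vectors, and the cosine ranking are all assumptions made for illustration. Image information is assumed to arrive as a list of text labels that some deep image model has already produced; that model is not shown.

```python
import math

# Hypothetical 3-dimensional word vectors standing in for trained embeddings.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.85, 0.15, 0.0],
    "dog": [0.7, 0.3, 0.1],
    "car": [0.0, 0.1, 0.9],
    "truck": [0.05, 0.15, 0.85],
    "cute": [0.6, 0.6, 0.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_text(tokens, k=1):
    """Append the k nearest embedding neighbours of each known token."""
    expanded = list(tokens)
    for tok in tokens:
        if tok not in EMBEDDINGS:
            continue
        candidates = [w for w in EMBEDDINGS if w != tok and w not in expanded]
        candidates.sort(key=lambda w: cosine(EMBEDDINGS[tok], EMBEDDINGS[w]),
                        reverse=True)
        expanded.extend(candidates[:k])
    return expanded


def build_pseudo_document(tokens, image_labels, k=1):
    """Expanded text plus image-derived labels (assumed to be given as words)."""
    return expand_text(tokens, k) + list(image_labels)


def doc_vector(tokens):
    """Average the vectors of known tokens (an assumed, simple vectorisation)."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def search(query_tokens, pseudo_docs):
    """Rank pseudo-documents by cosine similarity to the query vector."""
    q = doc_vector(query_tokens)
    scored = [(doc_id, cosine(q, doc_vector(toks)))
              for doc_id, toks in pseudo_docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "m1": build_pseudo_document(["cute"], ["cat"]),
        "m2": build_pseudo_document(["truck"], ["car"]),
    }
    for doc_id, score in search(["kitten"], docs):
        print(doc_id, round(score, 3))
```

In this toy setup the post `m1` never mentions "kitten", yet it ranks first for that query because its image label and its expanded text place it close to the query in the vector space. A real system would replace the hand-written table with trained embeddings and the label list with the output of an image model.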