Paper Information - When E-commerce Meets Social Media: Identifying Business on WeChat Moment Using Bilateral-Attention LSTM

WeChat Business, built on WeChat, the most widely used instant messaging platform in China, is a new business model that has burst into people's lives in the e-commerce era. In one of the most typical WeChat Business behaviors, users advertise products, promote companies, and share customer feedback with their WeChat friends by posting a WeChat Moment, a public status that contains images and text. Given the popularity and significance of this behavior, we propose a novel Bilateral-Attention LSTM network (BiATT-LSTM) to identify WeChat Business Moments from their texts and images. Unlike previous schemes, which treat the visual and textual modalities equally in a joint visual-textual classification task, we start with a text classification task based on an LSTM network and then incorporate a bilateral-attention mechanism that automatically learns two kinds of explicit attention weights for each word: 1) a global weight that is insensitive to the images in the same Moment as the word, and 2) a local weight that is sensitive to those images. Visual information serves as guidance for determining the local weight of a word in a specific Moment. Two-level experiments demonstrate the effectiveness of our framework, which outperforms other schemes that jointly model the visual and textual modalities. We also visualize the bilateral-attention mechanism to illustrate how it helps joint visual-textual classification.
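
To make the two kinds of word weights concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each word is already represented by a vector (a stand-in for an LSTM hidden state), that the global weight comes from a dot product with a hypothetical learned vector `global_query`, that the local weight comes from a dot product with a hypothetical `image_feature` vector, and that the two weights are combined by an elementwise product followed by renormalization. The abstract does not specify any of these choices; all names and the combination rule are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bilateral_attention(hidden_states, image_feature, global_query):
    """Illustrative two-weight attention over word vectors.

    hidden_states: list of per-word vectors (stand-ins for LSTM outputs).
    image_feature: vector summarizing the Moment's images (assumed given).
    global_query:  hypothetical learned vector that scores words
                   without looking at the images.
    """
    # Global weight: depends only on the word representation.
    global_w = softmax([dot(h, global_query) for h in hidden_states])
    # Local weight: depends on the word and on the image feature.
    local_w = softmax([dot(h, image_feature) for h in hidden_states])
    # Assumed combination rule: elementwise product, renormalized to sum to 1.
    combined = [g * l for g, l in zip(global_w, local_w)]
    total = sum(combined)
    combined = [c / total for c in combined]
    # Attention-weighted sum of the word vectors, usable as classifier input.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[i] for w, h in zip(combined, hidden_states))
              for i in range(dim)]
    return global_w, local_w, combined, pooled


# Same three "words", two different image features.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, 0.5]
g_a, l_a, c_a, pooled_a = bilateral_attention(hidden, [2.0, 0.0], query)
g_b, l_b, c_b, pooled_b = bilateral_attention(hidden, [0.0, 2.0], query)
```

In this sketch the global weights `g_a` and `g_b` are identical because they never see the image, while the local weights `l_a` and `l_b` differ between the two calls. That mirrors the distinction the abstract draws between image-insensitive and image-sensitive weights.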
