Brand Data Gathering From Live Social Media Streams

Social media streams, such as Twitter, Facebook, and Sina Weibo, have become essential real-time information resources with a wide range of users and applications. The rapidly growing volume of live information in these streams has significant societal and marketing value for large corporations and government organizations, which creates a strong need for effective data gathering and content analysis techniques. The problem is particularly challenging because posts are short and conversational, the data volume is huge, and the multimedia content in social media streams is increasingly heterogeneous. Moreover, because the focus of "conversation" often shifts quickly in social media, the traditional keyword-based approach to gathering data about a target brand is grossly inadequate. To address these problems, we propose a multi-faceted brand tracking method that gathers relevant data based not only on evolving keywords, but also on social factors (users, relations, and locations) and on visual content, since an increasing number of social media posts are in multimedia form. For evaluation, we constructed a large-scale microblog dataset (Brand-Social-Net) of brand and product information, containing 3 million microblogs with over 1.2 million images for 100 famous brands. Experiments on this dataset demonstrate that the proposed framework gathers a more complete set of relevant brand-related data from live social media streams. We have released this dataset to promote social media research.
