An Adaptive Method for Organization Name Disambiguation with Feature Reinforcing

Twitter is an online social networking service that has become an important source of information for marketing strategies and online reputation management. In this paper, we study the problem of organization name disambiguation in Twitter messages. The task is challenging because little information is available about either the organization or the tweets. We first mine organization information from web sources to train a general classifier. We then mine information from the tweets themselves: for a given organization name, we train an adaptive classifier with additional features derived from Twitter messages labeled by the general classifier. Experiments on WePS-3 show that mining web sources to enrich the organization information is effective, and that the adaptive classifier trained for a given organization is promising.
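The two-stage idea can be illustrated with a minimal sketch. It is not the classifier or feature set used in the paper, which the abstract does not specify. It assumes a hypothetical keyword profile for the organization (standing in for information mined from the web), a simple overlap score as the general classifier, and a small Naive Bayes model as the adaptive classifier trained on tweets that the general classifier labels confidently. All names, thresholds, and example tweets are illustrative assumptions.

```python
import math
from collections import Counter

PUNCT = ".,!?#@:;\"'()"

def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]

def general_score(tweet, org_profile):
    """General classifier stand-in: fraction of tokens found in the web-derived profile."""
    tokens = tokenize(tweet)
    if not tokens:
        return 0.0
    return sum(t in org_profile for t in tokens) / len(tokens)

def train_adaptive(tweets, org_profile, pos_thresh=0.2, neg_thresh=0.0):
    """Pseudo-label tweets with the general scorer, then collect per-class word counts."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for tweet in tweets:
        score = general_score(tweet, org_profile)
        if score >= pos_thresh:
            label = True
        elif score <= neg_thresh:
            label = False
        else:
            continue  # skip tweets the general classifier is unsure about
        counts[label].update(tokenize(tweet))
        docs[label] += 1
    return counts, docs

def adaptive_predict(tweet, model):
    """Naive Bayes with add-one smoothing over the pseudo-labeled tweets."""
    counts, docs = model
    vocab = set(counts[True]) | set(counts[False])
    total_docs = docs[True] + docs[False]
    log_prob = {}
    for label in (True, False):
        total = sum(counts[label].values())
        lp = math.log((docs[label] + 1) / (total_docs + 2))
        for t in tokenize(tweet):
            lp += math.log((counts[label][t] + 1) / (total + len(vocab) + 1))
        log_prob[label] = lp
    return log_prob[True] > log_prob[False]

# Hypothetical profile for the company "Apple"; the ambiguous name itself is left out.
org_profile = {"iphone", "ipad", "mac", "ios", "store", "cupertino"}

# Hypothetical unlabeled tweets that mention the ambiguous name.
tweets = [
    "Apple unveils new iphone and ipad at the store",
    "Apple keynote today new iphone launch with ios update",
    "Baked an apple pie with cinnamon for dessert",
    "Fresh apple juice and apple pie recipe",
]

model = train_adaptive(tweets, org_profile)
company_tweet = "Watching the Apple keynote launch"
fruit_tweet = "Grandma baked apple pie for dessert"
print(general_score(company_tweet, org_profile), adaptive_predict(company_tweet, model))
print(general_score(fruit_tweet, org_profile), adaptive_predict(fruit_tweet, model))
```

In this toy setup the first test tweet shares no word with the profile, so the overlap score alone gives it no support, while the adaptive model can still pick it up through words such as "keynote" and "launch" that it learned from the pseudo-labeled tweets.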
