In recent years Twitter has become one of the largest online microblogging platforms. Microblogging streams have become invaluable sources for many kinds of analyses, including online reputation management, news and trend detection, and targeted marketing and customer services [4, 9]. Searching and mining microblog streams offers interesting technical challenges because of the sheer volume of the data, its dynamic nature, the creative language usage, and the length of individual posts [3]. In many microblog search scenarios the goal is to find out what people are saying about concepts such as products, brands, persons, et cetera. Here, it is important to be able to accurately retrieve tweets that are on topic, including all possible naming and other lexical variants. It is therefore common to manually construct lengthy keyword queries that hopefully capture all possible variants. We propose an alternative approach, namely to determine what a microblog post is about by automatically identifying the concepts in it. We take a concept to be any item that has a unique and unambiguous entry in a well-known large-scale knowledge source, Wikipedia. Little research exists on understanding and modeling the semantics of individual microblog posts. Linking free text to knowledge resources, on the other hand, has received an increasing amount of attention in recent years. Starting from the domain of named entity recognition, current approaches establish links not just to entity types, but to the actual entities themselves [5]. With over 3.5 million articles, Wikipedia has become a rich source of knowledge and a common target for linking; automatic linking approaches using Wikipedia have met with considerable success [2, 6, 8]. Most, if not all, of the linking methods assume that the input text is relatively clean and grammatically correct and that it provides sufficient context for the purposes of identifying concepts. Microblog posts are short, noisy, and full of
[1] M. de Rijke et al. Mapping queries to the Linking Open Data cloud: A case study using DBpedia. J. Web Semant., 2011.
[2] Ian H. Witten et al. Learning to link with Wikipedia. CIKM '08, 2008.
[3] Paolo Ferragina et al. TAGME: on-the-fly annotation of short text fragments (by Wikipedia entities). CIKM, 2010.
[4] M. de Rijke et al. Adding semantics to microblog posts. WSDM '12, 2012.
[5] M. de Rijke et al. Generating links to background knowledge: a case study using narrative radiology reports. CIKM '11, 2011.
[6] M. de Rijke et al. Linking online news and social media. WSDM '11, 2011.
[7] Fabio Crestani et al. Statistics of Online User-Generated Short Documents. ECIR, 2010.
[8] Ming Zhou et al. Recognizing Named Entities in Tweets. ACL, 2011.
[9] Hosung Park et al. What is Twitter, a social network or a news media? WWW '10, 2010.