Paper information - INEX2014: Tweet Contextualization Using Association Rules between Terms

INEX2014: Tweet Contextualization Using Association Rules between Terms

Tweets are short messages that do not exceed 140 characters. Because they must be written within this limit, a particular vocabulary is used. To make them understandable to a reader, it is therefore necessary to know their context. In this paper, we describe our approach submitted for the tweet contextualization track in CLEF 2014 (Conference and Labs of the Evaluation Forum). This approach extends the tweet's vocabulary with a set of thematically related words by mining association rules between terms.

Web 2.0 is the term associated with the transition of the World Wide Web from a collection of individual web sites to an emerging platform in its own right. This emergence is due largely to user collaboration: users have been the driving force behind new services (1). One of these is the microblogging service, e.g., Twitter, which is a communication medium and a collaboration system that allows broadcasting short messages. In contrast to traditional blogs, media-sharing services, and social networks, microblogs (tweets) are textual messages submitted in real time to report an idea, a current interest, or an opinion (2). The size of these messages may be limited to a maximum number of characters. This constraint on message size leads to the use of a particular vocabulary. The aim is to exchange as much information as possible in as few characters as possible (3). In this respect, we focus on the Tweet Contextualization track. The participants of INEX 2014 are required to contextualize tweets, i.e., given a tweet and a related entity, they try to answer questions of the form "why this tweet concerns this entity? should it be an alert?". These questions can be answered by several sentences or by an aggregation of texts from different articles of Wikipedia.
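The idea of extending a tweet's vocabulary through association rules between terms can be illustrated with a minimal sketch. The text above does not specify the mining algorithm, the corpus, or the thresholds, so everything here is an assumption made for illustration: the toy corpus, the function names `mine_term_rules` and `expand_tweet`, the restriction to single-term rules, and the support and confidence thresholds are all hypothetical and should not be read as the authors' pipeline.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus standing in for a document collection.
TOY_CORPUS = [
    "obama president election",
    "obama president speech",
    "election vote president",
    "football match goal",
]


def mine_term_rules(documents, min_support=2, min_confidence=0.6):
    """Mine single-term rules a -> b from term co-occurrence in documents.

    support(a, b)     = number of documents containing both a and b
    confidence(a -> b) = support(a, b) / support(a)
    Returns {antecedent: [(consequent, confidence), ...]}.
    """
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        terms = sorted(set(doc.lower().split()))  # each term counted once per document
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))

    rules = {}
    for (a, b), support in pair_counts.items():
        if support < min_support:
            continue
        # A frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = support / term_counts[antecedent]
            if confidence >= min_confidence:
                rules.setdefault(antecedent, []).append((consequent, confidence))
    return rules


def expand_tweet(tweet, rules):
    """Append consequents of rules whose antecedent occurs in the tweet."""
    tweet_terms = tweet.lower().split()
    seen = set(tweet_terms)
    expansion = []
    for term in tweet_terms:
        # Highest-confidence consequents first, ties broken alphabetically.
        ranked = sorted(rules.get(term, []), key=lambda r: (-r[1], r[0]))
        for consequent, _ in ranked:
            if consequent not in seen:
                seen.add(consequent)
                expansion.append(consequent)
    return tweet_terms + expansion


rules = mine_term_rules(TOY_CORPUS)
print(expand_tweet("obama speech", rules))  # ['obama', 'speech', 'president']
```

In this toy setting the rule `obama -> president` holds with confidence 1.0, so the term `president` is added to the tweet. A realistic system would presumably mine rules from a much larger collection and use the expanded term set as a query against Wikipedia, but those steps are outside what this sketch covers.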

Cherif Chiraz Latiri | Yahya Slimani | Mohamed Ettaleb | Meriem Amina Zingla

[1] Mohammed J. Zaki, et al. CHARM: An Efficient Algorithm for Closed Itemset Mining, 2002, SDM.

[2] Florian Boudin, et al. LIA/LINA at the INEX 2012 Tweet Contextualization track, 2012, CLEF.

[3] Patrice Bellot, et al. Overview of INEX Tweet Contextualization 2014 track, 2014, CLEF.

[4] Prasenjit Majumder, et al. Query Expansion for Microblog Retrieval, 2011, TREC.

[5] Yuefeng Li, et al. Microblog Retrieval Using Topical Features and Query Expansion, 2011, TREC.

[6] Tomasz Imielinski, et al. Mining association rules between sets of items in large databases, 1993, SIGMOD Conference.

[7] Mohand Boughanem, et al. Uprising microblogs: a bayesian network retrieval model for tweet search, 2012, SAC '12.

[8] Mohamed Morchid, et al. INEX 2012 Benchmark a Semantic Space for Tweets Contextualization, 2012, CLEF.

[9] Ujjwal Maulik, et al. A New Approach for Association Rule Mining and Bi-clustering Using Formal Concept Analysis, 2012, MLDM.

[10] Hatem Haddad, et al. Towards an effective automatic query expansion process using an association rule mining approach, 2012, Journal of Intelligent Information Systems.